Preface


Many  different  mathematical  methods  and  concepts  are  used  in  classical 
mechanics:  differential  equations  and  phase  flows,  smooth  mappings  and 
manifolds,  Lie  groups  and  Lie  algebras,  symplectic  geometry  and  ergodic 
theory.  Many  modern  mathematical  theories  arose  from  problems  in 
mechanics  and  only  later  acquired  that  axiomatic-abstract  form  which 
makes  them  so  hard  to  study. 

In  this  book  we  construct  the  mathematical  apparatus  of  classical 
mechanics  from  the  very  beginning;  thus,  the  reader  is  not  assumed  to  have 
any  previous  knowledge  beyond  standard  courses  in  analysis  (differential 
and integral calculus, differential equations), geometry (vector spaces,
vectors)  and  linear  algebra  (linear  operators,  quadratic  forms). 

With  the  help  of  this  apparatus,  we  examine  all  the  basic  problems  in 
dynamics,  including  the  theory  of  oscillations,  the  theory  of  rigid  body 
motion,  and  the  hamiltonian  formalism.  The  author  has  tried  to  show  the 
geometric,  qualitative  aspect  of  phenomena.  In  this  respect  the  book  is 
closer  to  courses  in  theoretical  mechanics  for  theoretical  physicists  than  to 
traditional  courses  in  theoretical  mechanics  as  taught  by  mathematicians. 

A  considerable  part  of  the  book  is  devoted  to  variational  principles  and 
analytical  dynamics.  Characterizing  analytical  dynamics  in  his  “Lectures  on 
the  development  of  mathematics  in  the  nineteenth  century,”  F.  Klein  wrote 
that “... a physicist, for his problems, can extract from these theories only
very  little,  and  an  engineer  nothing.”  The  development  of  the  sciences  in  the 
following  years  decisively  disproved  this  remark.  Hamiltonian  formalism 
lay  at  the  basis  of  quantum  mechanics  and  has  become  one  of  the  most  often 
used  tools  in  the  mathematical  arsenal  of  physics.  After  the  significance  of 
symplectic  structures  and  Huygens’  principle  for  all  sorts  of  optimization 
problems  was  realized,  Hamilton’s  equations  began  to  be  used  constantly  in 
engineering calculations. On the other hand, the contemporary development
of  celestial  mechanics,  connected  with  the  requirements  of  space  exploration, 
created  new  interest  in  the  methods  and  problems  of  analytical  dynamics. 

The connections between classical mechanics and other areas of mathematics
and physics are many and varied. The appendices to this book are
devoted  to  a  few  of  these  connections.  The  apparatus  of  classical  mechanics 
is  applied  to:  the  foundations  of  riemannian  geometry,  the  dynamics  of 
an  ideal  fluid,  Kolmogorov’s  theory  of  perturbations  of  conditionally 
periodic  motion,  short-wave  asymptotics  for  equations  of  mathematical 
physics,  and  the  classification  of  caustics  in  geometrical  optics. 

These  appendices  are  intended  for  the  interested  reader  and  are  not  part 
of  the  required  general  course.  Some  of  them  could  constitute  the  basis  of 
special courses (for example, on asymptotic methods in the theory of nonlinear
oscillations or on quasi-classical asymptotics). The appendices also
contain  some  information  of  a  reference  nature  (for  example,  a  list  of  normal 
forms  of  quadratic  hamiltonians).  While  in  the  basic  chapters  of  the  book  the 
author  has  tried  to  develop  all  the  proofs  as  explicitly  as  possible,  avoiding 
references  to  other  sources,  the  appendices  consist  on  the  whole  of  summaries 
of  results,  the  proofs  of  which  are  to  be  found  in  the  cited  literature. 

The  basis  for  the  book  was  a  year-and-a-half-long  required  course 
in  classical  mechanics,  taught  by  the  author  to  third-  and  fourth-year 
mathematics  students  at  the  mathematics-mechanics  faculty  of  Moscow 
State  University  in  1966-1968. 

The  author  is  grateful  to  I.  G.  Petrovsky,  who  insisted  that  these  lectures 
be  delivered,  written  up,  and  published.  In  preparing  these  lectures  for 
publication, the author found very helpful the lecture notes of L. A. Bunimovich,
L. D. Vaingortin, V. L. Novikov, and especially, the mimeographed
edition (Moscow State University, 1968) organized by N. N. Kolesnikov. The
author thanks them, and also all the students and colleagues who communicated
their remarks on the mimeographed text; many of these remarks were
used in the preparation of the present edition. The author is grateful to
M. A. Leontovich, for suggesting the treatment of connections by means of a
limit process, and also to I. I. Vorovich and V. I. Yudovich for their detailed
review  of  the  manuscript. 
The translators would like to thank Dr. R. Barrar for his help in reading
the  proofs.  We  would  also  like  to  thank  many  readers,  especially  Ted  Courant, 
for  spotting  errors  in  the  first  two  printings. 

Berkeley, 1981
K. Vogtmann
A. Weinstein




Preface  to  the  second  edition 


The  main  part  of  this  book  was  written  twenty  years  ago.  The  ideas  and 
methods  of  symplectic  geometry,  developed  in  this  book,  have  now  found 
many  applications  in  mathematical  physics  and  in  other  domains  of  applied 
mathematics, as well as in pure mathematics itself. In particular, the theory
of short-wave asymptotic expansions has reached a very sophisticated level, with
many  important  applications  to  optics,  wave  theory,  acoustics,  spectroscopy, 
and  even  chemistry;  this  development  was  parallel  to  the  development  of  the 
theories  of  Lagrange  and  Legendre  singularities,  that  is,  of  singularities  of 
caustics  and  of  wave  fronts,  of  their  topology  and  their  perestroikas  (in 
Russian  metamorphoses  were  always  called  “perestroikas,”  as  in  “Morse 
perestroika”  for  the  English  “Morse  surgery”;  now  that  the  word  perestroika 
has  become  international,  we  may  preserve  the  Russian  term  in  translation 
and  are  not  obliged  to  substitute  “metamorphoses”  for  “perestroikas”  when 
speaking  of  wave  fronts,  caustics,  and  so  on). 

Integrable  hamiltonian  systems  have  been  discovered  unexpectedly  in  many 
classical  problems  of  mathematical  physics,  and  their  study  has  led  to  new 
results  in  both  physics  and  mathematics,  for  instance,  in  algebraic  geometry. 

Symplectic  topology  has  become  one  of  the  most  promising  and  active 
branches of “global analysis.” An important generalization of the Poincaré
“geometric  theorem”  (see  Appendix  9)  was  proved  by  C.  Conley  and 
E.  Zehnder  in  1983.  A  sequence  of  works  (by  M.  Chaperon,  A.  Weinstein,  J.-C. 
Sikorav,  M.  Gromov,  Ja.  M.  Eliashberg,  Ju.  Tchekanov,  A.  Floer,  C.  Viterbo, 
H. Hofer, and others) marks important progress in this very lively domain.
One  may  hope  that  this  progress  will  lead  to  the  proof  of  many  known 
conjectures  in  symplectic  and  contact  topology,  and  to  the  discovery  of  new 
results  in  this  new  domain  of  mathematics,  emerging  from  the  problems  of 
mechanics  and  optics. 
The present edition includes three new appendices. They represent the
modern  development  of  the  theory  of  ray  systems  (the  theory  of  singularity 
and  of  perestroikas  of  caustics  and  of  wave  fronts,  related  to  the  theory  of 
Coxeter  reflection  groups),  the  theory  of  integrable  systems  (the  geometric 
theory  of  elliptic  coordinates,  adapted  to  the  infinite-dimensional  Hilbert 
space generalization), and the theory of Poisson structures (which is a generalization
of the theory of symplectic structures, including degenerate Poisson
brackets). 

A  more  detailed  account  of  the  present  state  of  perturbation  theory  may  be 
found  in  the  book,  Mathematical  Aspects  of  Classical  and  Celestial  Mechanics 
by  V.  I.  Arnold,  V.  V.  Kozlov,  and  A.  I.  Neistadt,  Encyclopaedia  of  Math.  Sci., 
Vol.  3  (Springer,  1986);  Volume  4  of  this  series  (1988)  contains  a  survey 
“Symplectic  geometry”  by  V.  I.  Arnold  and  A.  B.  Givental’,  an  article  by 
A.  A.  Kirillov  on  geometric  quantization,  and  a  survey  of  the  modern  theory 
of  integrable  systems  by  S.  P.  Novikov,  I.  M.  Krichever,  and  B.  A.  Dubrovin. 

For  more  details  on  the  geometry  of  ray  systems,  see  the  book  Singularities 
of  Differentiable  Mappings  by  V.  I.  Arnold,  S.  M.  Gusein-Zade,  and  A.  N. 
Varchenko (Vol. 1, Birkhäuser, 1985; Vol. 2, Birkhäuser, 1988). Catastrophe
Theory  by  V.  I.  Arnold  (Springer,  1986)  (second  edition)  contains  a  long 
annotated  bibliography. 

Surveys  on  symplectic  and  contact  geometry  and  on  their  applications  may 
be  found  in  the  Bourbaki  seminar  (D.  Bennequin,  “Caustiques  mystiques”, 
February, 1986) and in a series of articles (V. I. Arnold, First steps of symplectic
topology, Russian Math. Surveys, 41 (1986); Singularities of ray systems,
Russian Math. Surveys, 38 (1983); Singularities in variational calculus,
Modern  Problems  of  Math.,  VINITI,  22  (1983)  (translated  in  J.  Soviet  Math.); 
and  O.  P.  Shcherbak,  Wave  fronts  and  reflection  groups,  Russian  Math. 
Surveys,  43  (1988)). 

Volumes  22  (1983)  and  33  (1988)  of  the  VINITI  series,  “Sovremennye 
problemy  mathematiki.  Noveishie  dostijenia,”  contain  a  dozen  articles  on  the 
applications  of  symplectic  and  contact  geometry  and  singularity  theory  to 
mathematics  and  physics. 

Bifurcation  theory  (both  for  hamiltonian  and  for  more  general  systems) 
is  discussed  in  the  textbook  Geometrical  Methods  of  the  Theory  of  Ordinary 
Differential  Equations  (Springer,  1988)  (this  new  edition  is  more  complete  than 
the  preceding  one).  The  survey  “Bifurcation  theory  and  its  applications  in 
mathematics and mechanics” (XVIIth International Congress of Theoretical
and Applied Mechanics in Grenoble, August, 1988) also contains new information,
as does Volume 5 of the Encyclopaedia of Math. Sci. (Springer, 1989),
containing  the  survey  “Bifurcation  theory”  by  V.  I.  Arnold,  V.  S.  Afraimovich, 
Ju.  S.  Iljashenko,  and  L.  P.  Shilnikov.  Volume  2  of  this  series,  edited  by 
D.  V.  Anosov  and  Ja.  G.  Sinai,  is  devoted  to  the  ergodic  theory  of  dynamical 
systems  including  those  of  mechanics. 

The  new  discoveries  in  all  these  theories  have  potentially  extremely  wide 
applications,  but  since  these  results  were  discovered  rather  recently,  they  are 
discussed only in the specialized editions, and applications are impeded by
the difficulty of the mathematical exposition for nonmathematicians. I hope
that the present book will help not only mathematicians, but also all those
readers who use the theory of dynamical systems, symplectic geometry, and the
calculus of variations in physics, mechanics, control theory, and so on, to
master these new theories. The author would like to thank Dr.
T.  Tokieda  for  his  help  in  correcting  errors  in  previous  printings  and  for 
reading  the  proofs. 


December  1988 


V.  I.  Arnold 


Translator’s  preface  to  the  second  edition 


This  edition  contains  three  new  appendices,  originally  written  for  inclusion  in 
a  German  edition.  They  describe  work  by  the  author  and  his  co-workers  on 
Poisson structures, elliptic coordinates with applications to integrable systems,
and singularities of ray systems. In addition, numerous corrections to
errors found by the author, the translators, and readers have been incorporated
into the text.
PART I

NEWTONIAN  MECHANICS 


Newtonian  mechanics  studies  the  motion  of  a  system  of  point  masses 
in  three-dimensional  euclidean  space.  The  basic  ideas  and  theorems  of 
newtonian  mechanics  (even  when  formulated  in  terms  of  three-dimensional 
cartesian coordinates) are invariant with respect to the six-dimensional¹
group  of  euclidean  motions  of  this  space. 

A  newtonian  potential  mechanical  system  is  specified  by  the  masses 
of  the  points  and  by  the  potential  energy.  The  motions  of  space  which  leave 
the  potential  energy  invariant  correspond  to  laws  of  conservation. 

Newton’s  equations  allow  one  to  solve  completely  a  series  of  important 
problems  in  mechanics,  including  the  problem  of  motion  in  a  central  force 
field. 


1  And  also  with  respect  to  the  larger  group  of  galilean  transformations  of  space-time. 


Experimental  facts 


In  this  chapter  we  write  down  the  basic  experimental  facts  which  lie  at  the 
foundation  of  mechanics:  Galileo’s  principle  of  relativity  and  Newton’s 
differential  equation.  We  examine  constraints  on  the  equation  of  motion 
imposed  by  the  relativity  principle,  and  we  mention  some  simple  examples. 

1  The  principles  of  relativity  and  determinacy 

In  this  paragraph  we  introduce  and  discuss  the  notion  of  an  inertial  coordinate  system.  The 
mathematical  statements  of  this  paragraph  are  formulated  exactly  in  the  next  paragraph. 

A series of experimental facts is at the basis of classical mechanics.² We
list  some  of  them. 

A  Space  and  time 

Our  space  is  three-dimensional  and  euclidean,  and  time  is  one-dimensional. 
B  Galileo's  principle  of  relativity 

There  exist  coordinate  systems  (called  inertial)  possessing  the  following 
two  properties: 

1.  All  the  laws  of  nature  at  all  moments  of  time  are  the  same  in  all  inertial 
coordinate  systems. 

2.  All  coordinate  systems  in  uniform  rectilinear  motion  with  respect  to  an 
inertial  one  are  themselves  inertial. 


2  All  these  “experimental  facts”  are  only  approximately  true  and  can  be  refuted  by  more  exact 
experiments.  In  order  to  avoid  cumbersome  expressions,  we  will  not  specify  this  from  now  on 
and  we  will  speak  of  our  mathematical  models  as  if  they  exactly  described  physical  phenomena. 


In  other  words,  if  a  coordinate  system  attached  to  the  earth  is  inertial, 
then  an  experimenter  on  a  train  which  is  moving  uniformly  in  a  straight  line 
with  respect  to  the  earth  cannot  detect  the  motion  of  the  train  by  experiments 
conducted  entirely  inside  his  car. 

In reality, the coordinate system associated with the earth is only approximately
inertial. Coordinate systems associated with the sun, the stars, etc.
are  more  nearly  inertial. 

C  Newton's  principle  of  determinacy 

The  initial  state  of  a  mechanical  system  (the  totality  of  positions  and 
velocities  of  its  points  at  some  moment  of  time)  uniquely  determines  all  of 
its  motion. 

It  is  hard  to  doubt  this  fact,  since  we  learn  it  very  early.  One  can  imagine 
a  world  in  which  to  determine  the  future  of  a  system  one  must  also  know  the 
acceleration  at  the  initial  moment,  but  experience  shows  us  that  our  world 
is  not  like  this. 


2  The  galilean  group  and  Newton’s  equations 

In  this  paragraph  we  define  and  investigate  the  galilean  group  of  space-time  transformations. 
Then  we  consider  Newton’s  equation  and  the  simplest  constraints  imposed  on  its  right-hand  side 
by the property of invariance with respect to galilean transformations.³

A  Notation 

We denote the set of all real numbers by $\mathbb{R}$. We denote by $\mathbb{R}^n$ an
n-dimensional real vector space.


Figure  1  Parallel  displacement 


Affine n-dimensional space $A^n$ is distinguished from $\mathbb{R}^n$ in that there is
“no fixed origin.” The group $\mathbb{R}^n$ acts on $A^n$ as the group of parallel
displacements (Figure 1):

$$a \to a + b, \qquad a \in A^n,\ b \in \mathbb{R}^n,\ a + b \in A^n.$$

[Thus the sum of two points of $A^n$ is not defined, but their difference is defined
and is a vector in $\mathbb{R}^n$.]
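
The distinction between points and displacements can be mirrored in a small program. The following is only an illustrative sketch: the class names `Point` and `Vector` are assumptions introduced here, not notation from the text. It allows point + vector and point - point, and refuses point + point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """An element of R^n: a parallel displacement (hypothetical helper type)."""
    coords: tuple

    def __add__(self, other):
        # vector + vector = vector
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(tuple(a + b for a, b in zip(self.coords, other.coords)))


@dataclass(frozen=True)
class Point:
    """An element of the affine space A^n: a location with no fixed origin."""
    coords: tuple

    def __add__(self, other):
        # point + vector = point; point + point is deliberately undefined
        if not isinstance(other, Vector):
            return NotImplemented
        return Point(tuple(a + b for a, b in zip(self.coords, other.coords)))

    def __sub__(self, other):
        # point - point = vector
        if not isinstance(other, Point):
            return NotImplemented
        return Vector(tuple(a - b for a, b in zip(self.coords, other.coords)))


a = Point((1.0, 2.0))
b = Vector((0.5, -1.0))
c = a + b            # the displaced point
d = c - a            # the displacement that carries a to c

try:
    a + c            # sum of two points: raises TypeError
    sum_defined = True
except TypeError:
    sum_defined = False
```

Returning `NotImplemented` rather than raising directly lets Python produce the usual `TypeError` for unsupported operands.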

3 The reader who has no need for the mathematical formulation of the assertions of Section 1
can omit this section.


A euclidean structure on the vector space $\mathbb{R}^n$ is a positive definite symmetric
bilinear form called a scalar product. The scalar product enables one to
define the distance

$$\rho(x, y) = \|x - y\| = \sqrt{(x - y,\ x - y)}$$

between points of the corresponding affine space $A^n$. An affine space with this
distance function is called a euclidean space and is denoted by $E^n$.

B  Galilean  structure 

The  galilean  space-time  structure  consists  of  the  following  three  elements: 

1. The universe: a four-dimensional affine⁴ space $A^4$. The points of $A^4$
are called world points or events. The parallel displacements of the universe
$A^4$ constitute a vector space $\mathbb{R}^4$.

2. Time: a linear mapping $t: \mathbb{R}^4 \to \mathbb{R}$ from the vector space of parallel
displacements of the universe to the real “time axis.” The time interval
from event $a \in A^4$ to event $b \in A^4$ is the number $t(b - a)$ (Figure 2). If
$t(b - a) = 0$, then the events a and b are called simultaneous.


Figure  2  Interval  of  time  t 


The set of events simultaneous with a given event forms a three-
dimensional affine subspace in $A^4$. It is called a space of simultaneous
events $A^3$.

The kernel of the mapping t consists of those parallel displacements of
$A^4$ which take some (and therefore every) event into an event simultaneous
with it. This kernel is a three-dimensional linear subspace $\mathbb{R}^3$ of the vector
space $\mathbb{R}^4$.

The  galilean  structure  includes  one  further  element. 

3. The distance between simultaneous events

$$\rho(a, b) = \|a - b\| = \sqrt{(a - b,\ a - b)}, \qquad a, b \in A^3,$$

is given by a scalar product on the space $\mathbb{R}^3$. This distance makes every
space of simultaneous events into a three-dimensional euclidean space $E^3$.

4 Formerly, the universe was provided not with an affine, but with a linear structure (the
geocentric system of the universe).


A space $A^4$, equipped with a galilean space-time structure, is called a
galilean  space. 

One can speak of two events occurring simultaneously in different places,
but the expression “two non-simultaneous events $a, b \in A^4$ occurring at
one and the same place in three-dimensional space” has no meaning as long
as we have not chosen a coordinate system.

The  galilean  group  is  the  group  of  all  transformations  of  a  galilean  space 
which  preserve  its  structure.  The  elements  of  this  group  are  called  galilean 
transformations.  Thus,  galilean  transformations  are  affine  transformations 
of  A4  which  preserve  intervals  of  time  and  the  distance  between  simultaneous 
events. 

Example. Consider the direct product⁵ $\mathbb{R} \times \mathbb{R}^3$ of the t axis with a three-
dimensional vector space $\mathbb{R}^3$; suppose $\mathbb{R}^3$ has a fixed euclidean structure.
Such  a  space  has  a  natural  galilean  structure.  We  will  call  this  space  galilean 
coordinate  space. 

We  mention  three  examples  of  galilean  transformations  of  this  space. 
First,  uniform  motion  with  velocity  v: 

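$$g_1(t, \mathbf{x}) = (t,\ \mathbf{x} + \mathbf{v}t) \qquad \forall t \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^3.$$

Next, translation of the origin:

$$g_2(t, \mathbf{x}) = (t + s,\ \mathbf{x} + \mathbf{s}) \qquad \forall t \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^3.$$

Finally, rotation of the coordinate axes:

$$g_3(t, \mathbf{x}) = (t,\ G\mathbf{x}) \qquad \forall t \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^3,$$

where $G: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation.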
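
As a numerical illustration, the three transformations above can be coded directly and composed. This is a minimal sketch in the coordinate space: the function names, the representation of an event as a pair `(t, (x1, x2, x3))`, and the sample parameters are all assumptions made here. It checks on a few events that time intervals and distances between simultaneous events are unchanged.

```python
import math


def g1(event, v):
    """Uniform motion with velocity v: (t, x) -> (t, x + v t)."""
    t, x = event
    return (t, tuple(xi + vi * t for xi, vi in zip(x, v)))


def g2(event, s_time, s_space):
    """Translation of the origin: (t, x) -> (t + s, x + s_vec)."""
    t, x = event
    return (t + s_time, tuple(xi + si for xi, si in zip(x, s_space)))


def g3(event, G):
    """Rotation of the axes: (t, x) -> (t, G x), with G a 3x3 orthogonal matrix."""
    t, x = event
    return (t, tuple(sum(G[i][j] * x[j] for j in range(3)) for i in range(3)))


def time_interval(a, b):
    """Time interval from event a to event b."""
    return b[0] - a[0]


def distance(a, b):
    """Euclidean distance between the spatial parts of two simultaneous events."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a[1], b[1])))


def rotation_z(angle):
    """Orthogonal matrix of a rotation about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def g(event):
    """One composite transformation g1 . g2 . g3 (g3 is applied first)."""
    rotated = g3(event, rotation_z(0.7))
    shifted = g2(rotated, 2.0, (1.0, -1.0, 0.5))
    return g1(shifted, (0.3, 0.0, -0.2))


# two simultaneous events and one later event
a = (1.0, (0.0, 0.0, 0.0))
b = (1.0, (3.0, 4.0, 0.0))
c = (4.0, (1.0, 1.0, 1.0))

same_distance = abs(distance(g(a), g(b)) - distance(a, b)) < 1e-12
same_interval = abs(time_interval(g(a), g(c)) - time_interval(a, c)) < 1e-12
```

A check on sample events is of course not a proof; it only makes the defining properties of a galilean transformation concrete.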

Problem. Show that every galilean transformation of the space $\mathbb{R} \times \mathbb{R}^3$
can be written in a unique way as the composition of a rotation, a translation,
and a uniform motion ($g = g_1 \circ g_2 \circ g_3$) (thus the dimension of the galilean
group is equal to 3 + 4 + 3 = 10).

Problem. Show that all galilean spaces are isomorphic to each other⁶
and, in particular, isomorphic to the coordinate space $\mathbb{R} \times \mathbb{R}^3$.

Let M be a set. A one-to-one correspondence $\varphi_1: M \to \mathbb{R} \times \mathbb{R}^3$ is called
a galilean coordinate system on the set M. A coordinate system $\varphi_2$ moves
uniformly with respect to $\varphi_1$ if $\varphi_1 \circ \varphi_2^{-1}: \mathbb{R} \times \mathbb{R}^3 \to \mathbb{R} \times \mathbb{R}^3$ is a galilean
transformation. The galilean coordinate systems $\varphi_1$ and $\varphi_2$ give M the same
galilean structure.

5 Recall that the direct product of two sets A and B is the set of ordered pairs (a, b), where
$a \in A$ and $b \in B$. The direct product of two spaces (vector, affine, euclidean) has the structure of a
space of the same type.

6  That  is,  there  is  a  one-to-one  mapping  of  one  to  the  other  preserving  the  galilean  structure. 


C Motion, velocity, acceleration

A motion in $\mathbb{R}^N$ is a differentiable mapping $\mathbf{x}: I \to \mathbb{R}^N$, where I is an interval
on the real axis.

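The derivative

$$\dot{\mathbf{x}}(t_0) = \left.\frac{d\mathbf{x}}{dt}\right|_{t = t_0} = \lim_{h \to 0} \frac{\mathbf{x}(t_0 + h) - \mathbf{x}(t_0)}{h} \in \mathbb{R}^N$$

is called the velocity vector at the point $t_0 \in I$.

The second derivative

$$\ddot{\mathbf{x}}(t_0) = \left.\frac{d^2\mathbf{x}}{dt^2}\right|_{t = t_0}$$

is called the acceleration vector at the point $t_0$.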
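
The two definitions can be tried out numerically. The sketch below is an illustration under stated assumptions: it uses central differences rather than the one-sided quotient in the definition, and the sample motion (uniform circular motion) and step sizes are choices made here, not taken from the text.

```python
import math


def motion(t):
    """A sample motion in R^2: uniform motion around the unit circle."""
    return (math.cos(t), math.sin(t))


def velocity(x, t0, h=1e-5):
    """Central-difference approximation to the first derivative of x at t0."""
    return tuple((p - m) / (2 * h) for p, m in zip(x(t0 + h), x(t0 - h)))


def acceleration(x, t0, h=1e-4):
    """Central-difference approximation to the second derivative of x at t0."""
    return tuple((p - 2 * c + m) / h ** 2
                 for p, c, m in zip(x(t0 + h), x(t0), x(t0 - h)))


t0 = 0.3
v = velocity(motion, t0)      # close to (-sin t0, cos t0)
a = acceleration(motion, t0)  # close to (-cos t0, -sin t0), that is, -x(t0)
```

For this motion the acceleration vector points from the moving point toward the center of the circle.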

We will assume that the functions we encounter are continuously differentiable
as many times as necessary. In the future, unless otherwise stated,
mappings, functions, etc. are understood to be differentiable mappings,
functions, etc. The image of a mapping $\mathbf{x}: I \to \mathbb{R}^N$ is called a trajectory or
curve in $\mathbb{R}^N$.

Problem.  Is  it  possible  for  the  trajectory  of  a  differentiable  motion  on  the 
plane  to  have  the  shape  drawn  in  Figure  3?  Is  it  possible  for  the  acceleration 
vector  to  have  the  value  shown? 

Answer.  Yes.  No. 


Figure 3 Trajectory of motion of a point

We now define a mechanical system of n points moving in three-dimensional
euclidean space.

Let $\mathbf{x}: \mathbb{R} \to \mathbb{R}^3$ be a motion in $\mathbb{R}^3$. The graph⁷ of this mapping is a curve
in $\mathbb{R} \times \mathbb{R}^3$.

A  curve  in  galilean  space  which  appears  in  some  (and  therefore  every) 
galilean  coordinate  system  as  the  graph  of  a  motion,  is  called  a  world  line 
(Figure  4). 

7 The graph of a mapping $f: A \to B$ is the subset of the direct product $A \times B$ consisting of all
pairs $(a, f(a))$ with $a \in A$.


A  motion  of  a  system  of  n  points  gives,  in  galilean  space,  n  world  lines. 
In a galilean coordinate system they are described by n mappings $\mathbf{x}_i: \mathbb{R} \to \mathbb{R}^3$,
$i = 1, \dots, n$.

The direct product of n copies of $\mathbb{R}^3$ is called the configuration space
of the system of n points. Our n mappings $\mathbf{x}_i: \mathbb{R} \to \mathbb{R}^3$ define one mapping

$$\mathbf{x}: \mathbb{R} \to \mathbb{R}^N, \qquad N = 3n,$$

of the time axis into the configuration space. Such a mapping is also called
a motion of a system of n points in the galilean coordinate system on $\mathbb{R} \times \mathbb{R}^3$.

D  Newton's  equations 

According to Newton’s principle of determinacy (Section 1C) all motions
of a system are uniquely determined by their initial positions ($\mathbf{x}(t_0) \in \mathbb{R}^N$)
and initial velocities ($\dot{\mathbf{x}}(t_0) \in \mathbb{R}^N$).

In particular, the initial positions and velocities determine the acceleration.
In other words, there is a function $\mathbf{F}: \mathbb{R}^N \times \mathbb{R}^N \times \mathbb{R} \to \mathbb{R}^N$ such that

(1) $\ddot{\mathbf{x}} = \mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t)$.

Newton  used  Equation  (1)  as  the  basis  of  mechanics.  It  is  called  Newton's 
equation. 

By the theorem of existence and uniqueness of solutions to ordinary
differential equations, the function $\mathbf{F}$ and the initial conditions $\mathbf{x}(t_0)$ and
$\dot{\mathbf{x}}(t_0)$ uniquely determine a motion.⁸

For each specific mechanical system the form of the function F is determined
experimentally. From the mathematical point of view the form of F
for  each  system  constitutes  the  definition  of  that  system. 
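
To see determinacy at work, one can integrate Equation (1) numerically: once F and the initial position and velocity are given, the rest of the motion follows. The sketch below is illustrative only. It assumes the classical fourth-order Runge-Kutta scheme as the integrator and a hypothetical right-hand side F(x, x', t) = -x (a unit harmonic oscillator); neither choice comes from the text.

```python
import math


def integrate(F, x0, v0, t0, t1, steps=1000):
    """Integrate x'' = F(x, x', t) from t0 to t1 with the classical RK4 scheme.

    x0 and v0 are tuples in R^N; the return value is the state (x, v) at t1.
    """
    def add(u, w, k):
        # componentwise u + k * w
        return tuple(ui + k * wi for ui, wi in zip(u, w))

    h = (t1 - t0) / steps
    x, v, t = tuple(x0), tuple(v0), t0
    for _ in range(steps):
        # equivalent first-order system: dx/dt = v, dv/dt = F(x, v, t)
        k1x, k1v = v, F(x, v, t)
        k2x = add(v, k1v, h / 2)
        k2v = F(add(x, k1x, h / 2), add(v, k1v, h / 2), t + h / 2)
        k3x = add(v, k2v, h / 2)
        k3v = F(add(x, k2x, h / 2), add(v, k2v, h / 2), t + h / 2)
        k4x = add(v, k3v, h)
        k4v = F(add(x, k3x, h), add(v, k3v, h), t + h)
        x = tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                  for xi, a, b, c, d in zip(x, k1x, k2x, k3x, k4x))
        v = tuple(vi + h / 6 * (a + 2 * b + 2 * c + d)
                  for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v))
        t += h
    return x, v


def oscillator(x, v, t):
    """Hypothetical right-hand side F(x, x', t) = -x."""
    return tuple(-xi for xi in x)


# initial state x(0) = 1, x'(0) = 0; the exact motion is x(t) = cos t
x_end, v_end = integrate(oscillator, (1.0,), (0.0,), 0.0, 1.0)
```

Changing only the initial acceleration is impossible here: it is already fixed by F and the initial state, which is the content of the principle of determinacy.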

E  Constraints  imposed  by  the  principle  of  relativity 

Galileo’s  principle  of  relativity  states  that  in  physical  space-time  there  is  a 
selected  galilean  structure  (“the  class  of  inertial  coordinate  systems”) 
having  the  following  property. 


8  Under  certain  smoothness  conditions,  which  we  assume  to  be  fulfilled.  In  general,  a  motion 
is  determined  by  Equation  (1)  only  on  some  interval  of  the  time  axis.  For  simplicity  we  will 
assume  that  this  interval  is  the  whole  time  axis,  as  is  the  case  in  most  problems  in  mechanics. 


Figure 5 Galileo’s principle of relativity

If we subject the world lines of all the points of any mechanical system⁹
to  one  and  the  same  galilean  transformation,  we  obtain  world  lines  of  the 
same  system  (with  new  initial  conditions)  (Figure  5). 

This  imposes  a  series  of  conditions  on  the  form  of  the  right-hand  side  of 
Newton’s  equation  written  in  an  inertial  coordinate  system:  Equation  (1) 
must  be  invariant  with  respect  to  the  group  of  galilean  transformations. 

Example  1.  Among  the  galilean  transformations  are  the  time  translations. 
Invariance  with  respect  to  time  translations  means  that  “the  laws  of  nature 
remain constant,” i.e., if $\mathbf{x} = \boldsymbol{\varphi}(t)$ is a solution to Equation (1), then for any
$s \in \mathbb{R}$, $\mathbf{x} = \boldsymbol{\varphi}(t + s)$ is also a solution.

From this it follows that the right-hand side of Equation (1) in an inertial
coordinate system does not depend on the time:

$$\ddot{\mathbf{x}} = \boldsymbol{\Phi}(\mathbf{x}, \dot{\mathbf{x}}).$$
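
The step from invariance to a time-independent right-hand side can be spelled out as follows. This is a sketch: the auxiliary name $\boldsymbol{\psi}$ is introduced here, and the argument uses the principle of determinacy to treat an arbitrary pair of position and velocity as the state of some solution at an arbitrary moment.

```latex
\begin{align*}
\boldsymbol{\psi}(t) &:= \boldsymbol{\varphi}(t + s), \\
\ddot{\boldsymbol{\psi}}(t) &= \ddot{\boldsymbol{\varphi}}(t + s)
  = \mathbf{F}\bigl(\boldsymbol{\varphi}(t + s),\ \dot{\boldsymbol{\varphi}}(t + s),\ t + s\bigr)
  = \mathbf{F}\bigl(\boldsymbol{\psi}(t),\ \dot{\boldsymbol{\psi}}(t),\ t + s\bigr), \\
\ddot{\boldsymbol{\psi}}(t) &= \mathbf{F}\bigl(\boldsymbol{\psi}(t),\ \dot{\boldsymbol{\psi}}(t),\ t\bigr)
  \quad \text{(since } \boldsymbol{\psi} \text{ is also a solution)}, \\
\mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t + s) &= \mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t)
  \quad \text{for all } \mathbf{x},\ \dot{\mathbf{x}},\ t,\ s, \\
\mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t) &= \mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, 0)
  =: \boldsymbol{\Phi}(\mathbf{x}, \dot{\mathbf{x}}).
\end{align*}
```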

Remark.  Differential  equations  in  which  the  right-hand  side  does  depend 
on  time  arise  in  the  following  situation. 

Suppose  that  we  are  studying  part  I  of  the  mechanical  system  I  +  II. 
Then  the  influence  of  part  II  on  part  I  can  sometimes  be  replaced  by  a  time 
variation  of  parameters  in  the  system  of  equations  describing  the  motion  of 
part  I.  For  example,  the  influence  of  the  moon  on  the  earth  can  be  ignored  in 
investigating  the  majority  of  phenomena  on  the  earth.  However,  in  the  study  of 
the  tides  this  influence  must  be  taken  into  account;  one  can  achieve  this  by 
introducing,  instead  of  the  attraction  of  the  moon,  periodic  changes  in  the 
strength  of  gravity  on  earth. 


9  In  formulating  the  principle  of  relativity  we  must  keep  in  mind  that  it  is  relevant  only  to 
closed  physical  (in  particular,  mechanical)  systems,  i.e.,  that  we  must  include  in  the  system  all 
bodies  whose  interactions  play  a  role  in  the  study  of  the  given  phenomena.  Strictly  speaking,  we 
should  include  in  the  system  all  bodies  in  the  universe.  But  we  know  from  experience  that  one 
can  disregard  the  effect  of  many  of  them:  for  example,  in  studying  the  motion  of  planets  around 
the  sun  we  can  disregard  the  attractions  among  the  stars,  etc. 

On  the  other  hand,  in  the  study  of  a  body  in  the  vicinity  of  earth,  the  system  is  not  closed 
if  the  earth  is  not  included;  in  the  study  of  the  motion  of  an  airplane  the  system  is  not  closed  if 
it  does  not  include  the  air  surrounding  the  airplane,  etc.  In  the  future,  the  term  “mechanical 
system”  will  mean  a  closed  system  in  most  cases,  and  when  there  is  a  non-closed  system  in 
question this will be explicitly stated (cf., for example, Section 3).


Equations  with  variable  coefficients  can  appear  also  as  the  result  of  formal 
operations  in  the  solution  of  problems. 

Example 2. Translations in three-dimensional space are galilean transformations.
Invariance with respect to such translations means that space
is homogeneous, or “has the same properties at all of its points.” That is,
if $\mathbf{x}_i = \boldsymbol{\varphi}_i(t)$ $(i = 1, \dots, n)$ is a motion of a system of n points satisfying (1),
then for any $\mathbf{r} \in \mathbb{R}^3$ the motion $\boldsymbol{\varphi}_i(t) + \mathbf{r}$ $(i = 1, \dots, n)$ also satisfies Equation
(1).

From this it follows that the right-hand side of Equation (1) in the inertial
coordinate system can depend only on the “relative coordinates” $\mathbf{x}_j - \mathbf{x}_k$.

From invariance under passage to a uniformly moving coordinate system
(which does not change $\ddot{\mathbf{x}}_i$ or $\mathbf{x}_j - \mathbf{x}_k$, but adds to each $\dot{\mathbf{x}}_j$ a fixed vector $\mathbf{v}$) it
follows that the right-hand side of Equation (1) in an inertial system of
coordinates can depend only on the relative velocities

$$\ddot{\mathbf{x}}_i = \mathbf{f}_i(\{\mathbf{x}_j - \mathbf{x}_k,\ \dot{\mathbf{x}}_j - \dot{\mathbf{x}}_k\}), \qquad i, j, k = 1, \dots, n.$$

Example  3.  Among  the  galilean  transformations  are  the  rotations  in  three- 
dimensional  space.  Invariance  with  respect  to  these  rotations  means  that 
space  is  isotropic;  there  are  no  preferred  directions. 

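Thus, if $\boldsymbol{\varphi}_i: \mathbb{R} \to \mathbb{R}^3$ $(i = 1, \dots, n)$ is a motion of a system of points satisfying
(1), and $G: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation, then the motion
$G\boldsymbol{\varphi}_i: \mathbb{R} \to \mathbb{R}^3$ $(i = 1, \dots, n)$ also satisfies (1). In other words,

$$\mathbf{F}(G\mathbf{x}, G\dot{\mathbf{x}}) = G\mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}),$$

where $G\mathbf{x}$ denotes $(G\mathbf{x}_1, \dots, G\mathbf{x}_n)$, $\mathbf{x}_i \in \mathbb{R}^3$.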
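
The constraints of Examples 2 and 3 can be checked numerically for a concrete right-hand side. The sketch below is an illustration under assumptions made here: the force law (linear attraction plus linear damping between every pair of points), the helper names, and the sample data are hypothetical. Because this F depends only on relative positions and relative velocities, it is unchanged by translations and by adding a common velocity, and it commutes with rotations.

```python
import math


def F(xs, vs):
    """Hypothetical right-hand side for n points in R^3.

    The acceleration of point i is the sum over j of
    -(x_i - x_j) - 0.1 * (v_i - v_j); the term with j = i vanishes.
    """
    acc = []
    for xi, vi in zip(xs, vs):
        a = [0.0, 0.0, 0.0]
        for xj, vj in zip(xs, vs):
            for k in range(3):
                a[k] += -(xi[k] - xj[k]) - 0.1 * (vi[k] - vj[k])
        acc.append(tuple(a))
    return acc


def shift(points, r):
    """Add the fixed vector r to every point (or to every velocity)."""
    return [tuple(p[k] + r[k] for k in range(3)) for p in points]


def rotate(points, G):
    """Apply the 3x3 matrix G to every vector in the list."""
    return [tuple(sum(G[i][j] * p[j] for j in range(3)) for i in range(3))
            for p in points]


def close(A, B, tol=1e-12):
    """Componentwise comparison of two lists of vectors."""
    return all(abs(a - b) < tol for p, q in zip(A, B) for a, b in zip(p, q))


c, s = math.cos(0.9), math.sin(0.9)
G = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]   # rotation about the first axis

xs = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (-1.0, 0.3, 2.0)]
vs = [(0.1, 0.0, 0.0), (0.0, -0.2, 0.4), (0.3, 0.3, 0.0)]

# homogeneity of space: F(x + r, v) = F(x, v)
translation_ok = close(F(shift(xs, (5.0, -3.0, 1.0)), vs), F(xs, vs))
# uniformly moving frame: F(x, v + w) = F(x, v)
boost_ok = close(F(xs, shift(vs, (2.0, 0.0, -1.0))), F(xs, vs))
# isotropy of space: F(Gx, Gv) = G F(x, v)
rotation_ok = close(F(rotate(xs, G), rotate(vs, G)), rotate(F(xs, vs), G))
```

A right-hand side containing an absolute position, for example a term proportional to x_i alone, would fail the first check.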

Problem.  Show  that  if  a  mechanical  system  consists  of  only  one  point,  then 
its  acceleration  in  an  inertial  coordinate  system  is  equal  to  zero  (“Newton’s 
first  law”). 

Hint. By Examples 1 and 2 the acceleration vector does not depend on
$\mathbf{x}$, $\dot{\mathbf{x}}$, or t, and by Example 3 the vector $\mathbf{F}$ is invariant with respect to rotation.
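
One way to finish the argument suggested by the hint is sketched below; the constant vector is called $\mathbf{F}_0$ here, a name not used in the text.

```latex
\begin{align*}
&\mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t) = \mathbf{F}_0
  \quad \text{(a constant vector, by Examples 1 and 2)}, \\
&G\mathbf{F}_0 = \mathbf{F}_0
  \quad \text{for every orthogonal } G \text{ (by Example 3)}, \\
&\text{take } G \text{ to be the rotation through } \pi
  \text{ about an axis perpendicular to } \mathbf{F}_0: \quad
  G\mathbf{F}_0 = -\mathbf{F}_0, \\
&\text{hence } \mathbf{F}_0 = -\mathbf{F}_0, \quad \text{so } \mathbf{F}_0 = 0
  \text{ and } \ddot{\mathbf{x}} = 0.
\end{align*}
```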

Problem.  A  mechanical  system  consists  of  two  points.  At  the  initial  moment 
their  velocities  (in  some  inertial  coordinate  system)  are  equal  to  zero.  Show 
that  the  points  will  stay  on  the  line  which  connected  them  at  the  initial 
moment. 

Problem.  A  mechanical  system  consists  of  three  points.  At  the  initial  moment 
their  velocities  (in  some  inertial  coordinate  system)  are  equal  to  zero. 
Show  that  the  points  always  remain  in  the  plane  which  contained  them  at  the 
initial  moment. 

Problem.  A  mechanical  system  consists  of  two  points.  Show  that  for  any 
initial  conditions  there  exists  an  inertial  coordinate  system  in  which  the 
two  points  remain  in  a  fixed  plane. 


Problem.  Show  that  mechanics  “through  the  looking  glass”  is  identical 
to  ours. 

Hint. In the galilean group there is a reflection transformation, changing
the  orientation  of  R3. 

Problem.  Is  the  class  of  inertial  systems  unique? 

Answer.  No.  Other  classes  can  be  obtained  if  one  changes  the  units  of  length 
and  time  or  the  direction  of  time. 


3  Examples  of  mechanical  systems 

We have already remarked that the form of the function F in Newton’s equation (1) is determined
experimentally for each mechanical system. Here are several examples.

In examining concrete systems it is reasonable not to include all the objects of the universe
in a system. For example, in studying the majority of phenomena taking place on the earth we
can ignore the influence of the moon. Furthermore, it is usually possible to disregard the effect
of the processes we are studying on the motion of the earth itself; we may even consider a coordinate
system attached to the earth as “fixed.” It is clear that the principle of relativity no longer
imposes the constraints found in Section 2 for equations of motion written in such a coordinate
system. For example, near the earth there is a distinguished direction, the vertical.

A  Example 1: A stone falling to the earth

Experiments show that

(2)  $\ddot x = -g$, where $g \approx 9.8\ \mathrm{m/s^2}$ (Galileo),*

where x is the height of a stone above the surface of the earth.

If we introduce the “potential energy” $U = gx$, then Equation (2) can
be written in the form

$$\ddot x = -\frac{dU}{dx}.$$

If $U: E^N \to \mathbb{R}$ is a differentiable function on euclidean space, then we will
denote by $\partial U/\partial \mathbf x$ the gradient of the function U. If $E^N = E^{n_1} \times \cdots \times E^{n_k}$
is a direct product of euclidean spaces, then we will denote a point $\mathbf x \in E^N$
by $(\mathbf x_1, \ldots, \mathbf x_k)$, and the vector $\partial U/\partial \mathbf x$ by $(\partial U/\partial \mathbf x_1, \ldots, \partial U/\partial \mathbf x_k)$. In particular,
if $x_1, \ldots, x_N$ are cartesian coordinates in $E^N$, then the components of the
vector $\partial U/\partial \mathbf x$ are the partial derivatives $\partial U/\partial x_1, \ldots, \partial U/\partial x_N$.

Experiments show that the radius vector of the stone with respect to
some point 0 on the earth satisfies the equation

(3)  $\ddot{\mathbf x} = -\dfrac{\partial U}{\partial \mathbf x}$, where $U = -(\mathbf g, \mathbf x)$.

*  In  this  and  other  sections,  the  mass  of  a  particle  is  taken  to  be  1. 


The  vector  in  the  right-hand  side  is  directed  towards  the  earth.  It  is  called 
the  gravitational  acceleration  vector  g.  (Figure  6.) 



Figure  6  A  stone  falling  to  the  earth 


B  Example  2:  Falling  from  great  height 

Like  all  experimental  facts,  the  law  of  motion  (2)  has  a  restricted  domain  of 
application.  According  to  a  more  precise  law  of  falling  bodies,  discovered 
by  Newton,  acceleration  is  inversely  proportional  to  the  square  of  the  distance 
from  the  center  of  the  earth : 


$$\ddot x = -g\,\frac{r_0^2}{r^2},$$

where $r = r_0 + x$ (Figure 7).


Figure  7  The  earth’s  gravitational  field 


This  equation  can  also  be  written  in  the  form  (3),  if  we  introduce  the 
potential  energy 

$$U = -\frac{k}{r}, \qquad k = g r_0^2,$$

inversely  proportional  to  the  distance  to  the  center  of  the  earth. 

Problem.  Determine  with  what  velocity  a  stone  must  be  thrown  in  order  that 
it  fly  infinitely  far  from  the  surface  of  the  earth.10 

Answer.  >11.2  km/sec. 
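
The answer can be checked numerically. The sketch below assumes that the stone just reaches infinity with zero speed, so that its total energy $v^2/2 - k/r_0$ with $k = g r_0^2$ is zero, which gives $v = \sqrt{2 g r_0}$. The radius of the earth used here is an assumed value that does not appear in the text.

```python
import math

G_SURFACE = 9.8     # m/s^2, the value of g quoted in Example 1
R_EARTH = 6.37e6    # m, assumed mean radius of the earth (not given in the text)

def escape_velocity(g=G_SURFACE, r0=R_EARTH):
    """Speed at distance r0 for which the total energy v^2/2 - g*r0^2/r0 vanishes."""
    return math.sqrt(2.0 * g * r0)

v2_km_per_s = escape_velocity() / 1000.0
print(f"second cosmic velocity ~ {v2_km_per_s:.1f} km/sec")
```

The result depends only on the product $g r_0$, so a slightly different value for the radius changes the answer very little.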


10  This  is  the  so-called  second  cosmic  velocity  v2.  Our  equation  does  not  take  into  account  the 
attraction  of  the  sun.  The  attraction  of  the  sun  will  not  let  the  stone  escape  from  the  solar  system 
if  the  velocity  of  the  stone  with  respect  to  the  earth  is  less  than  16.6  km/sec. 


C  Example 3: Motion of a weight along a line under the action of a spring

Experiments show that under small extensions of the spring the equation
of motion of the weight will be (Figure 8)

$$\ddot x = -\alpha^2 x.$$


Figure  8  Weight  on  a  spring 


This equation can also be written in the form (3) if we introduce the
potential energy

$$U = \frac{\alpha^2 x^2}{2}.$$


If  we  replace  our  one  weight  by  two  weights,  then  it  turns  out  that,  under 
the  same  extension  of  the  spring,  the  acceleration  is  half  as  large. 

It is experimentally established that for any two bodies the ratio of the
accelerations $\ddot x_1/\ddot x_2$ under the same extension of a spring is fixed (does not
depend on the extent of extension of the spring or on its characteristics, but
only on the bodies themselves). The value inverse to this ratio is by definition
the ratio of masses:

$$\frac{m_1}{m_2} = \frac{\ddot x_2}{\ddot x_1}.$$


For a unit of mass we take the mass of some fixed body, e.g., one liter of
water. We know by experience that the masses of all bodies are positive. The
product of mass times acceleration $m\ddot x$ does not depend on the body, and
is a characteristic of the extension of the spring. This value is called the
force of the spring acting on the body.

As a unit of force, we take the “newton.” If one liter of water is suspended
on a spring at the surface of the earth, the spring acts with a force of 9.8
newtons (= 1 kg).

D  Example  4:  Conservative  systems 

Let $E^{3n} = E^3 \times \cdots \times E^3$ be the configuration space of a system of n points
in the euclidean space $E^3$. Let $U: E^{3n} \to \mathbb{R}$ be a differentiable function and
let $m_1, \ldots, m_n$ be positive numbers.

Definition. The motion of n points, of masses $m_1, \ldots, m_n$, in the potential
field with potential energy U is given by the system of differential equations

(4)  $m_i \ddot{\mathbf x}_i = -\dfrac{\partial U}{\partial \mathbf x_i}, \qquad i = 1, \ldots, n.$

The  equations  of  motion  in  Examples  1  to  3  have  this  form.  The  equations 
of  motion  of  many  other  mechanical  systems  can  be  written  in  the  same  form. 
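For example, the three-body problem of celestial mechanics is problem (4)
in which

$$U = -\frac{m_1 m_2}{\|\mathbf x_1 - \mathbf x_2\|} - \frac{m_2 m_3}{\|\mathbf x_2 - \mathbf x_3\|} - \frac{m_3 m_1}{\|\mathbf x_3 - \mathbf x_1\|}.$$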
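
As an illustration, the three-body potential and the right-hand sides of system (4) can be written out directly. This is only a sketch: the function names, the sample masses and the sample positions are hypothetical, and the gravitational constant is set to 1 as in the formula for U. The check at the end reflects the invariance under translations discussed in Section 2: shifting all points by the same vector leaves U unchanged, and the forces sum to zero.

```python
import math

def potential(masses, positions):
    """U = -(sum over pairs i < j) m_i m_j / |x_i - x_j|, gravitational constant 1."""
    total = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            total -= masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return total

def forces(masses, positions):
    """Right-hand sides -dU/dx_i of system (4), one 3-vector per point."""
    result = []
    for i, xi in enumerate(positions):
        fi = [0.0, 0.0, 0.0]
        for j, xj in enumerate(positions):
            if i == j:
                continue
            r = math.dist(xi, xj)
            for a in range(3):
                # Attraction of point i towards point j.
                fi[a] -= masses[i] * masses[j] * (xi[a] - xj[a]) / r**3
        result.append(fi)
    return result

# Hypothetical sample configuration.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
shift = (5.0, -3.0, 2.0)
shifted = [tuple(p[a] + shift[a] for a in range(3)) for p in positions]

net_force = [sum(f[a] for f in forces(masses, positions)) for a in range(3)]
print(net_force, potential(masses, positions), potential(masses, shifted))
```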

Many  different  equations  of  entirely  different  origin  can  be  reduced  to 
form  (4),  for  example  the  equations  of  electrical  oscillations.  In  the  following 
chapter  we  will  study  mainly  systems  of  differential  equations  in  the  form  (4). 
Investigation of the equations of motion


In  most  cases  (for  example,  in  the  three-body  problem)  we  can  neither  solve 
the  system  of  differential  equations  nor  completely  describe  the  behavior 
of  the  solutions.  In  this  chapter  we  consider  a  few  simple  but  important 
problems  for  which  Newton’s  equations  can  be  solved. 

4  Systems  with  one  degree  of  freedom 

In this paragraph we study the phase flow of the differential equation (1). A look at the graph of
the potential energy is enough for a qualitative analysis of such an equation. In addition, Equation
(1) is integrated by quadratures.

A  Definitions 

A  system  with  one  degree  of  freedom  is  a  system  described  by  one  differential 
equation 

(1)  $\ddot x = f(x), \qquad x \in \mathbb{R}.$

The kinetic energy is the quadratic form*

$$T = \tfrac12 \dot x^2.$$

The potential energy is the function

$$U(x) = -\int_{x_0}^{x} f(\xi)\, d\xi.$$

The sign in this formula is taken so that the potential energy of a stone is
larger if the stone is higher off the ground.

Notice that the potential energy determines f. Therefore, to specify a
system of the form (1) it is enough to give the potential energy. Adding a
constant to the potential energy does not change the equation of motion (1).

* See footnote on p. 11.


The  total  energy  is  the  sum 

$$E = T + U.$$

In general, the total energy is a function, $E(x, \dot x)$, of $x$ and $\dot x$.

Theorem (The law of conservation of energy). The total energy of points
moving according to the equation (1) is conserved: $E(x(t), \dot x(t))$ is independent
of t.

Proof.

$$\frac{d}{dt}(T + U) = \dot x \ddot x + \frac{dU}{dx}\,\dot x = \dot x\,(\ddot x - f(x)) = 0. \qquad \square$$
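
The theorem can be observed numerically. The sketch below is not part of the text's argument: it integrates $\ddot x = f(x)$ for the assumed choice $f(x) = -\sin x$ (so $U = -\cos x$) with the velocity-Verlet scheme, a standard integrator chosen here for illustration, and reports how far the computed total energy drifts from its initial value. A small drift is expected from discretization error, not from the equation itself.

```python
import math

def f(x):
    """Right-hand side of x'' = f(x); here the pendulum, f(x) = -sin x."""
    return -math.sin(x)

def U(x):
    """Potential energy with dU/dx = -f(x); the additive constant is arbitrary."""
    return -math.cos(x)

def energy(x, v):
    """Total energy E = T + U for unit mass."""
    return 0.5 * v * v + U(x)

def integrate(x, v, dt, steps):
    """Advance (x, v) by `steps` velocity-Verlet steps of size dt."""
    for _ in range(steps):
        v_half = v + 0.5 * dt * f(x)
        x = x + dt * v_half
        v = v_half + 0.5 * dt * f(x)
    return x, v

x0, v0 = 1.0, 0.0
x1, v1 = integrate(x0, v0, dt=1e-3, steps=20_000)
drift = abs(energy(x1, v1) - energy(x0, v0))
print(f"energy drift after t = 20: {drift:.2e}")
```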


B  Phase  flow 

Equation  (1)  is  equivalent  to  the  system  of  two  equations: 

(2)  $\dot x = y, \qquad \dot y = f(x).$

We  consider  the  plane  with  coordinates  x  and  y,  which  we  call  the  phase  plane 
of  Equation  (1).  The  points  of  the  phase  plane  are  called  phase  points.  The 
right-hand  side  of  (2)  determines  a  vector  field  on  the  phase  plane,  called  the 
phase  velocity  vector  field. 

A solution of (2) is a motion $\varphi: \mathbb{R} \to \mathbb{R}^2$ of a phase point in the phase
plane, such that the velocity of the moving point at each moment of time is
equal to the phase velocity vector at the location of the phase point at that
moment.11

The image of $\varphi$ is called the phase curve. Thus the phase curve is given by
the parametric equations

$$x = \varphi(t), \qquad y = \dot\varphi(t).$$

Problem.  Show  that  through  every  phase  point  there  is  one  and  only  one 
phase  curve. 

Hint.  Refer  to  a  textbook  on  ordinary  differential  equations. 

We  notice  that  a  phase  curve  could  consist  of  only  one  point.  Such  a 
point  is  called  an  equilibrium  position.  The  vector  of  phase  velocity  at  an 
equilibrium  position  is  zero. 

The law of conservation of energy allows one to find the phase curves
easily. On each phase curve the value of the total energy is constant. Therefore,
each phase curve lies entirely in one energy level set $E(x, y) = h$.

C  Examples 

Example  1.  The  basic  equation  of  the  theory  of  oscillations  is 

$$\ddot x = -x.$$

11 Here we assume for simplicity that the solution $\varphi$ is defined on the whole time axis $\mathbb{R}$.

Figure 9  Phase plane of the equation $\ddot x = -x$

In this case (Figure 9) we have:

$$T = \frac{\dot x^2}{2}, \qquad U = \frac{x^2}{2}, \qquad E = \frac{\dot x^2}{2} + \frac{x^2}{2}.$$

The energy level sets are the concentric circles and the origin. The phase
velocity vector at the phase point $(x, y)$ has components $(y, -x)$. It is
perpendicular to the radius vector and equal to it in magnitude. Therefore,
the motion of the phase point in the phase plane is a uniform motion around
0: $x = r_0 \cos(\varphi_0 - t)$, $y = r_0 \sin(\varphi_0 - t)$. Each energy level set is a phase
curve.

Example 2. Suppose that a potential energy is given by the graph in Figure
10. We will draw the energy level sets $\tfrac12 y^2 + U(x) = E$. For this, the following
facts are helpful.

1. Any equilibrium position of (2) must lie on the x axis of the phase plane.
The point $x = \xi$, $y = 0$ is an equilibrium position if $\xi$ is a critical point
of the potential energy, i.e., if $(dU/dx)|_{x = \xi} = 0$.

2.  Each  level  set  is  a  smooth  curve  in  a  neighborhood  of  each  of  its  points 
which  is  not  an  equilibrium  position  (this  follows  from  the  implicit 
function  theorem).  In  particular,  if  the  number  E  is  not  a  critical  value  of 
the  potential  energy  (i.e.,  is  not  the  value  of  the  potential  energy  at  one  of 
its  critical  points),  then  the  level  set  on  which  the  energy  is  equal  to  E 
is  a  smooth  curve. 

It  follows  that  in  order  to  study  the  energy  level  curve,  we  should  turn 
our  attention  to  the  critical  and  near-critical  values  of  E.  It  is  convenient 
here  to  imagine  a  little  ball  rolling  in  the  potential  well  U. 

For example, consider the following argument: “Kinetic energy is
nonnegative. This means that potential energy is less than or equal to the
total energy. The smaller the potential energy, the greater the velocity.”
This translates to: “The ball cannot jump out of the potential well, rising
higher than the level determined by its initial energy. As it falls into the well,
the ball gains velocity.” We also notice that the local maximum points of the
potential energy are unstable, but the minimum points are stable equilibrium
positions.

Figure 10  Potential energy and phase curves

Problem.  Prove  this. 

Problem. How many phase curves make up the separatrix (figure eight)
curve, corresponding to the level $E_2$?

Answer. Three.

Problem.  Determine  the  duration  of  motion  along  the  separatrix. 

Answer.  It  follows  from  the  uniqueness  theorem  that  the  time  is  infinite. 

Problem. Show that the time it takes to go from $x_1$ to $x_2$ (in one direction)
is equal to

$$t_2 - t_1 = \int_{x_1}^{x_2} \frac{dx}{\sqrt{2(E - U(x))}}.$$
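
A numerical illustration of this formula, under assumptions chosen for the example: the potential is $U = x^2/2$ and the energy is $E = 1/2$, so the turning points are $\pm 1$ and the travel time between them should be half the period $2\pi$ of the equation $\ddot x = -x$. The helper name `travel_time` and the use of the midpoint rule are illustrative choices only.

```python
import math

def travel_time(U, E, x1, x2, n=100_000):
    """Midpoint-rule estimate of the integral of dx / sqrt(2 (E - U(x))) over [x1, x2].

    The integrand is unbounded at turning points, where E = U(x). The midpoint
    rule never evaluates the endpoints, but it converges slowly in that case.
    """
    h = (x2 - x1) / n
    total = 0.0
    for k in range(n):
        x = x1 + (k + 0.5) * h
        total += h / math.sqrt(2.0 * (E - U(x)))
    return total

# Harmonic oscillator U = x^2 / 2 at energy E = 1/2: turning points -1 and 1.
half_period = travel_time(lambda x: 0.5 * x * x, E=0.5, x1=-1.0, x2=1.0)
print(half_period, math.pi)
```

For accurate work near turning points a change of variable that removes the square-root singularity would be preferable; the crude rule is used here only to keep the correspondence with the formula visible.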


Problem.  Draw  the  phase  curves,  given  the  potential  energy  graphs  in 
Figure  11. 

Answer.  Figure  12. 



Figure  12  Phase  curves 

Problem. Draw the phase curves for the “equation of an ideal planar
pendulum”: $\ddot x = -\sin x$.

Problem. Draw the phase curves for the “equation of a pendulum on a
rotating axis”: $\ddot x = -\sin x + M$.

Remark. In these two problems x denotes the angle of displacement of the
pendulum. The phase points whose coordinates differ by $2\pi$ correspond to
the same position of the pendulum. Therefore, in addition to the phase plane,
it is natural to look at the phase cylinder $\{x \pmod{2\pi},\ y\}$.

Problem. Find the tangent lines to the branches of the critical level
corresponding to maximal potential energy $E = U(\xi)$ (Figure 13).

Answer. $y = \pm\sqrt{-U''(\xi)}\,(x - \xi)$.


Figure  13  Critical  energy  level  lines 


Problem. Let $S(E)$ be the area enclosed by the closed phase curve
corresponding to the energy level E. Show that the period of motion along
this curve is equal to

$$T = \frac{dS}{dE}.$$

Problem. Let $E_0$ be the value of the potential function at a minimum point
$\xi$. Find the period $T_0 = \lim_{E \to E_0} T(E)$ of small oscillations in a
neighborhood of the point $\xi$.

Answer. $2\pi/\sqrt{U''(\xi)}$.

Problem. Consider a periodic motion along the closed phase curve
corresponding to the energy level E. Is it stable in the sense of Liapunov?12

Answer.  No.13 


D  Phase  flow 

Let  M  be  a  point  in  the  phase  plane.  We  look  at  the  solution  to  system  (2) 
whose  initial  conditions  at  t  =  0  are  represented  by  the  point  M.  We  assume 
that  any  solution  of  the  system  can  be  extended  to  the  whole  time  axis.  The 
value  of  our  solution  at  any  value  of  t  depends  on  M.  We  denote  the  resulting 
phase  point  (Figure  14)  by 

$$M(t) = g^t M.$$

In this way we have defined a mapping of the phase plane to itself,
$g^t: \mathbb{R}^2 \to \mathbb{R}^2$. By theorems in the theory of ordinary differential equations,

12 For a definition, see, e.g., p. 155 of Ordinary Differential Equations by V. I. Arnold, MIT Press,
1973.

13  The  only  exception  is  the  case  when  the  period  does  not  depend  on  the  energy. 


Figure  14  Phase  flow 


the mapping $g^t$ is a diffeomorphism (a one-to-one differentiable mapping
with a differentiable inverse). The diffeomorphisms $g^t$, $t \in \mathbb{R}$, form a group:
$g^{t+s} = g^t \circ g^s$. The mapping $g^0$ is the identity ($g^0 M = M$), and $g^{-t}$ is the
inverse of $g^t$. The mapping $g: \mathbb{R} \times \mathbb{R}^2 \to \mathbb{R}^2$, defined by $g(t, M) = g^t M$, is
differentiable. All these properties together are expressed by saying that the
transformations $g^t$ form a one-parameter group of diffeomorphisms of the phase
plane. This group is also called the phase flow, given by system (2) (or
Equation (1)).

Example. The phase flow given by the equation $\ddot x = -x$ is the group $g^t$
of rotations of the phase plane through angle t around the origin.
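
The group properties can be checked directly for this example. The sketch below writes the flow of the system $\dot x = y$, $\dot y = -x$ in closed form (with these sign conventions the rotation is clockwise) and compares $g^t g^s M$ with $g^{t+s} M$ for an arbitrarily chosen point and arbitrarily chosen times. The function name `flow` and the sample values are illustrative.

```python
import math

def flow(t, point):
    """Phase flow g^t of x'' = -x: (x, y) -> (x cos t + y sin t, -x sin t + y cos t)."""
    x, y = point
    c, s = math.cos(t), math.sin(t)
    return (x * c + y * s, -x * s + y * c)

M = (0.3, -1.2)     # arbitrary phase point
t, s = 0.7, 2.5     # arbitrary times

composed = flow(t, flow(s, M))    # g^t (g^s M)
direct = flow(t + s, M)           # g^{t+s} M
back = flow(-t, flow(t, M))       # g^{-t} g^t M, should return M

print(composed, direct, back)
```

Since the energy level sets here are circles, each $g^t$ also preserves $x^2 + y^2$, in agreement with the law of conservation of energy.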

Problem. Show that the system with potential energy $U = -x^4$ does not
define a phase flow.

Problem.  Show  that  if  the  potential  energy  is  positive,  then  there  is  a  phase 
flow. 

Hint.  Use  the  law  of  conservation  of  energy  to  show  that  a  solution  can 
be  extended  without  bound. 

Problem. Draw the image of the circle $x^2 + (y - 1)^2 < \tfrac14$ under the action
of a transformation of the phase flow for the equations (a) of the “inverse
pendulum,” $\ddot x = x$ and (b) of the “nonlinear pendulum,” $\ddot x = -\sin x$.

Answer.  Figure  15. 


5  Systems  with  two  degrees  of  freedom 

Analyzing  a  general  potential  system  with  two  degrees  of  freedom  is  beyond  the  capability 
of  modern  science.  In  this  paragraph  we  look  at  the  simplest  examples. 

A  Definitions 

By  a  system  with  two  degrees  of  freedom  we  will  mean  a  system  defined  by 
the  differential  equations 

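(1)  $\ddot{\mathbf x} = \mathbf f(\mathbf x), \qquad \mathbf x \in E^2,$

where f is a vector field on the plane.

A system is said to be conservative if there exists a function $U: E^2 \to \mathbb{R}$
such that $\mathbf f = -\partial U/\partial \mathbf x$. The equation of motion of a conservative system
then has the form14 $\ddot{\mathbf x} = -\partial U/\partial \mathbf x$.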

B  The  law  of  conservation  of  energy 

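Theorem. The total energy of a conservative system is conserved, i.e.,

$$\frac{dE}{dt} = 0, \quad \text{where } E = \tfrac12 \dot{\mathbf x}^2 + U(\mathbf x), \quad \dot{\mathbf x}^2 = (\dot{\mathbf x}, \dot{\mathbf x}).$$

Proof. $dE/dt = (\dot{\mathbf x}, \ddot{\mathbf x}) + (\partial U/\partial \mathbf x, \dot{\mathbf x}) = (\ddot{\mathbf x} + \partial U/\partial \mathbf x, \dot{\mathbf x}) = 0$ by the equation
of motion. □


Corollary. If at the initial moment the total energy is equal to E, then all
trajectories lie in the region where $U(\mathbf x) \le E$, i.e., a point remains inside
the potential well $U(x_1, x_2) \le E$ for all time.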


Remark. In a system with one degree of freedom it is always possible to
introduce the potential energy

$$U(x) = -\int_{x_0}^{x} f(\xi)\, d\xi.$$

For a system with two degrees of freedom this is not so.


Problem. Find an example of a system of the form $\ddot{\mathbf x} = \mathbf f(\mathbf x)$, $\mathbf x \in E^2$, which is
not  conservative. 


C  Phase  space 

The  equation  of  motion  (1)  can  be  written  as  the  system: 


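(2)  $\dot x_1 = y_1, \qquad \dot x_2 = y_2, \qquad \dot y_1 = -\dfrac{\partial U}{\partial x_1}, \qquad \dot y_2 = -\dfrac{\partial U}{\partial x_2}.$


14 In cartesian coordinates on the plane $E^2$, $\ddot x_1 = -\partial U/\partial x_1$ and $\ddot x_2 = -\partial U/\partial x_2$.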


The phase space of a system with two degrees of freedom is the
four-dimensional space with coordinates $x_1$, $x_2$, $y_1$, and $y_2$.

The system (2) defines the phase velocity vector field in four space as well
as15 the phase flow of the system (a one-parameter group of diffeomorphisms
of four-dimensional phase space). The phase curves of (2) are subsets of
four-dimensional phase space. All of phase space is partitioned into phase curves.
Projecting the phase curves from four space to the $x_1$, $x_2$ plane gives the
trajectories of our moving point in the $x_1$, $x_2$ plane. These trajectories are
also called orbits. Orbits can have points of intersection even when the phase
curves do not intersect one another. The equation of the law of conservation
of energy

$$E = \frac{\dot{\mathbf x}^2}{2} + U(\mathbf x) = \frac{y_1^2 + y_2^2}{2} + U(x_1, x_2)$$

defines a three-dimensional hypersurface in four space: $E(x_1, x_2, y_1, y_2) = E_0$;
this surface, $\pi_{E_0}$, remains invariant under the phase flow: $g^t \pi_{E_0} = \pi_{E_0}$.
One could say that the phase flow flows along the energy level hypersurfaces.
The phase velocity vector field is tangent at every point to $\pi_{E_0}$. Therefore,
$\pi_{E_0}$ is entirely composed of phase curves (Figure 16).




Figure  16  Energy  level  surface  and  phase  curves 

Example 1 (“small oscillations of a spherical pendulum”). Let $U = \tfrac12(x_1^2 + x_2^2)$.
The level sets of the potential energy in the $x_1$, $x_2$ plane will be concentric
circles (Figure 17).

The equations of motion, $\ddot x_1 = -x_1$, $\ddot x_2 = -x_2$, are equivalent to the
system

$$\dot x_1 = y_1, \qquad \dot x_2 = y_2,$$
$$\dot y_1 = -x_1, \qquad \dot y_2 = -x_2.$$

This system decomposes into two independent ones; in other words,
each of the coordinates $x_1$ and $x_2$ changes with time in the same way as in
a system with one degree of freedom.

15  With  the  usual  limitations. 


Figure 17  Potential energy level curves for a spherical pendulum

A solution has the form

$$x_1 = c_1 \cos t + c_2 \sin t, \qquad x_2 = c_3 \cos t + c_4 \sin t,$$
$$y_1 = -c_1 \sin t + c_2 \cos t, \qquad y_2 = -c_3 \sin t + c_4 \cos t.$$

It follows from the law of conservation of energy that

$$E = \tfrac12(y_1^2 + y_2^2) + \tfrac12(x_1^2 + x_2^2) = \text{const},$$

i.e., the level surface $\pi_{E_0}$ is a sphere in four space.

Problem.  Show  that  the  phase  curves  are  great  circles  of  this  sphere.  (A 
great  circle  is  the  intersection  of  a  sphere  with  a  two-dimensional  plane 
passing  through  its  center.) 

Problem. Show that the set of phase curves on the surface $\pi_{E_0}$ forms a
two-dimensional sphere. The formula $w = (x_1 + i y_1)/(x_2 + i y_2)$ gives the “Hopf
map” from the three sphere $\pi_{E_0}$ to the two sphere (the complex w-plane
completed by the point at infinity). Our phase curves are the pre-images
of points under the Hopf map.

Problem. Find the projection of the phase curves on the $x_1$, $x_2$ plane (i.e.,
draw the orbits of the motion of a point).

Example 2 (“Lissajous figures”). We look at one more example of a planar
motion (“small oscillations with two degrees of freedom”):

$$\ddot x_1 = -x_1, \qquad \ddot x_2 = -\omega^2 x_2.$$

The potential energy is

$$U = \tfrac12 x_1^2 + \tfrac12 \omega^2 x_2^2.$$

From the law of conservation of energy it follows that, if at the initial
moment of time the total energy is

$$\tfrac12(\dot x_1^2 + \dot x_2^2) + U(x_1, x_2) = E,$$

then all motions will take place inside the ellipse $U(x_1, x_2) \le E$.


Our system consists of two independent one-dimensional systems.
Therefore, the law of conservation of energy is satisfied for each of them separately,
i.e., the following quantities are preserved:

$$E_1 = \tfrac12 \dot x_1^2 + \tfrac12 x_1^2, \qquad E_2 = \tfrac12 \dot x_2^2 + \tfrac12 \omega^2 x_2^2 \qquad (E = E_1 + E_2).$$

Consequently, the variable $x_1$ is bounded by the region $|x_1| \le A_1$, $A_1 = \sqrt{2E_1(0)}$,
and $x_2$ oscillates within the region $|x_2| \le A_2$. The intersection
of these two regions defines a rectangle which contains the orbits (Figure 18).




Problem. Show that this rectangle is inscribed in the ellipse $U \le E$.

The general solution of our equations is $x_1 = A_1 \sin(t + \varphi_1)$,
$x_2 = A_2 \sin(\omega t + \varphi_2)$; a moving point independently performs an oscillation
with frequency 1 and amplitude $A_1$ along the horizontal and an oscillation
with frequency $\omega$ and amplitude $A_2$ along the vertical.

Consider the following method of describing an orbit in the $x_1$, $x_2$ plane.
We look at a cylinder with base $2A_1$ and a band of width $2A_2$. We draw on
the band a sine wave with period $2\pi A_1/\omega$ and amplitude $A_2$ and wind the
band onto the cylinder (Figure 19). The orthogonal projection of the sinusoid
wound around the cylinder onto the $x_1$, $x_2$ plane gives the desired orbit,
called a Lissajous figure.

Figure 19  Construction of a Lissajous figure

Lissajous figures can conveniently be seen on an oscilloscope which
displays independent harmonic oscillations on the horizontal and vertical axes.

The form of a Lissajous figure very strongly depends on the frequency $\omega$.
If $\omega = 1$ (the spherical pendulum of Example 1), then the curve on the
cylinder is an ellipse. The projection of this ellipse onto the $x_1$, $x_2$ plane
depends on the difference $\varphi_2 - \varphi_1$ between the phases. For $\varphi_1 = \varphi_2$ we get
a segment of the diagonal of the rectangle; for small $\varphi_2 - \varphi_1$ we get an
ellipse close to the diagonal and inscribed in the rectangle. For $\varphi_2 - \varphi_1 = \pi/2$
we get an ellipse with principal axes $x_1$, $x_2$; as $\varphi_2 - \varphi_1$ increases from $\pi/2$
to $\pi$ the ellipse collapses onto the second diagonal; as $\varphi_2 - \varphi_1$ increases
further the whole process is repeated from the beginning (Figure 20).




Now let the frequencies be only approximately equal: $\omega \approx 1$. The segment
of the curve corresponding to $0 \le t \le 2\pi$ is very close to an ellipse. The next
loop also reminds one of an ellipse, but here the phase shift $\varphi_2 - \varphi_1$ is
greater than in the original by $2\pi(\omega - 1)$. Therefore, the Lissajous curve
with $\omega \approx 1$ is a distorted ellipse, slowly progressing through all phases
from collapsed onto one diagonal to collapsed onto the other (Figure 21).

If one of the frequencies is twice the other ($\omega = 2$), then for some particular
phase shift the Lissajous figure becomes a doubly traversed arc (Figure 22).




Problem. Show that this curve is a parabola. By increasing the phase shift
$\varphi_2 - \varphi_1$ we get in turn the curves in Fig. 23.

In general, if one of the frequencies is n times bigger than the other ($\omega = n$),
then among the graphs of the corresponding Lissajous figures there is the
graph of a polynomial of degree n (Figure 24); this polynomial is called a
Chebyshev polynomial.
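
The connection with Chebyshev polynomials can be checked numerically. The sketch below makes two assumptions that are not spelled out in the text: the amplitudes are taken equal to 1, and both phases are taken equal to $\pi/2$, so that $x_1 = \cos t$ and $x_2 = \cos nt$. The polynomial $T_n$ is computed from the standard recurrence $T_0 = 1$, $T_1 = x$, $T_{k+1} = 2xT_k - T_{k-1}$, and the orbit is compared with the graph $x_2 = T_n(x_1)$. The function names are illustrative.

```python
import math

def chebyshev(n, x):
    """T_n(x) from the recurrence T_0 = 1, T_1 = x, T_{k+1} = 2 x T_k - T_{k-1}."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def lissajous_point(t, omega, phi1, phi2):
    """Point x1 = sin(t + phi1), x2 = sin(omega t + phi2) of the orbit, unit amplitudes."""
    return math.sin(t + phi1), math.sin(omega * t + phi2)

n = 3
max_gap = 0.0
for k in range(200):
    t = 2.0 * math.pi * k / 200
    # With both phases equal to pi/2: x1 = cos t, x2 = cos(n t).
    x1, x2 = lissajous_point(t, n, math.pi / 2, math.pi / 2)
    max_gap = max(max_gap, abs(x2 - chebyshev(n, x1)))

print(f"largest deviation from the graph of T_{n}: {max_gap:.2e}")
```

For $n = 2$ the same check reproduces the parabola $x_2 = 2x_1^2 - 1$ of the preceding problem.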


Figure 22  Lissajous figure with $\omega = 2$

Figure 23  Series of Lissajous figures with $\omega = 2$

Figure 24  Chebyshev polynomials


Problem. Show that if $\omega = m/n$, then the Lissajous figure is a closed algebraic
curve; but if $\omega$ is irrational, then the Lissajous figure fills the rectangle
everywhere densely. What does the corresponding phase trajectory fill out?

6  Conservative  force  fields 

In  this  section  we  study  the  connection  between  work  and  potential  energy. 

A  Work  of  a  force  field  along  a  path 

Recall the definition of the work by a force F on a path S. The work of the
constant force F (for example, the force with which we lift up a load) on the
path $S = \overrightarrow{M_1 M_2}$ is, by definition, the scalar product (Figure 25)

$$A = (F, S) = |F|\,|S| \cos\varphi.$$

Figure 25  Work of the constant force F along the straight path S

Suppose we are given a vector field F and a curve $l$ of finite length. We
approximate the curve $l$ by a polygonal line with components $\Delta S_i$ and denote
by $F_i$ the value of the force at some particular point of $\Delta S_i$; then the work of
the field F on the path $l$ is by definition (Figure 26)

$$A = \lim_{|\Delta S_i| \to 0} \sum (F_i, \Delta S_i).$$

In analysis courses it is proved that if the field is continuous and the path
rectifiable, then the limit exists. It is denoted by $\int_l (F, dS)$.


Figure  26  Work  of  the  force  field  F  along  the  path  / 


B  Conditions  for  a  field  to  be  conservative 

Theorem. A vector field F is conservative if and only if its work along any
path $M_1 M_2$ depends only on the endpoints of the path, and not on the shape
of the path.

Proof. Suppose that the work of a field F does not depend on the path. Then

$$U(M) = -\int_{M_0}^{M} (F, dS)$$

is well defined as a function of the point M. It is easy to verify that

$$F = -\frac{\partial U}{\partial \mathbf x},$$

i.e., the field is conservative and U is its potential energy. Of course, the
potential energy is defined only up to the additive constant $U(M_0)$, which
can be chosen arbitrarily.

Conversely, suppose that the field F is conservative and that U is its
potential energy. Then it is easily verified that

$$\int_{M_0}^{M} (F, dS) = -U(M) + U(M_0),$$

i.e., the work does not depend on the shape of the path. □

Problem. Show that the vector field $F_1 = x_2$, $F_2 = -x_1$ is not conservative
(Figure 27).

Figure 27  A non-potential field

Problem. Is the field in the plane minus the origin given by $F_1 = x_2/(x_1^2 + x_2^2)$,
$F_2 = -x_1/(x_1^2 + x_2^2)$ conservative? Show that a field is conservative if and
only if its work along any closed contour is equal to zero.
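
The closed-contour criterion lends itself to a numerical experiment. The sketch below approximates the work along a circle by the polygonal sum $\sum (F_i, \Delta S_i)$ from Section 6A and applies it to the field $F_1 = x_2$, $F_2 = -x_1$ and, for comparison, to a constant field. The function names, the choice of the unit circle, and the constant comparison field are assumptions made for the illustration; a nonzero result on one contour shows that a field is not conservative, while a zero result on one contour proves nothing by itself.

```python
import math

def loop_work(field, radius=1.0, n=20_000):
    """Work of `field` along a circle about the origin, traversed counterclockwise.

    The circle is replaced by an inscribed polygon with n sides; the field is
    evaluated on the circle at the middle angle of each side.
    """
    total = 0.0
    for k in range(n):
        a0 = 2.0 * math.pi * k / n
        a1 = 2.0 * math.pi * (k + 1) / n
        am = 0.5 * (a0 + a1)
        f1, f2 = field(radius * math.cos(am), radius * math.sin(am))
        dx1 = radius * (math.cos(a1) - math.cos(a0))
        dx2 = radius * (math.sin(a1) - math.sin(a0))
        total += f1 * dx1 + f2 * dx2
    return total

def rotational(x1, x2):
    """The field F1 = x2, F2 = -x1 of the first problem."""
    return x2, -x1

def uniform(x1, x2):
    """A constant field, which is conservative (U = -x1)."""
    return 1.0, 0.0

work_rotational = loop_work(rotational)
work_uniform = loop_work(uniform)
print(work_rotational, work_uniform)
```

On the unit circle the first value is close to $-2\pi$, while the second vanishes up to rounding error.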

C  Central  fields 

Definition.  A  vector  field  in  the  plane  E2  is  called  central  with  center  at  0, 
if  it  is  invariant  with  respect  to  the  group  of  motions16  of  the  plane 
which  fix  0. 

16  Including  reflections. 




Problem.  Show  that  all  vectors  of  a  central  field  lie  on  rays  through  0,  and 
that  the  magnitude  of  the  vector  field  at  a  point  depends  only  on  the  distance 
from  the  point  to  the  center  of  the  field. 

It  is  also  useful  to  look  at  central  fields  which  are  not  defined  at  the  point  0. 

Example. The newtonian field $\mathbf F = -k\,\mathbf r/|\mathbf r|^3$ is central, but the field in
the problem in Section 6B is not.

Theorem. Every central field is conservative, and its potential energy depends only on the distance to the center of the field, U = U(r).

Proof. According to the previous problem, we may set $\mathbf{F}(\mathbf{r}) = \Phi(r)\,\mathbf{e}_r$, where $\mathbf{r}$ is the radius vector with respect to 0, r is its length and the unit vector $\mathbf{e}_r = \mathbf{r}/|\mathbf{r}|$ its direction. Then

$$\int_{M_1}^{M_2} (\mathbf{F}, d\mathbf{S}) = \int_{r(M_1)}^{r(M_2)} \Phi(r)\,dr,$$

and this integral is obviously independent of the path. □

Problem.  Compute  the  potential  energy  of  the  newtonian  field. 

Remark. The definitions and theorems of this paragraph can be directly carried over to a euclidean space $E^n$ of any dimension.


7  Angular  momentum 

We will see later that the invariance of an equation of a mechanical problem with respect to some group of transformations always implies a conservation law. A central field is invariant with respect to the group of rotations. The corresponding first integral is called the angular momentum.

Definition. The motion of a material point (with unit mass) in a central field on a plane is defined by the equation

$$\ddot{\mathbf{r}} = \Phi(r)\,\mathbf{e}_r,$$

where $\mathbf{r}$ is the radius vector beginning at the center of the field 0, r is its length, and $\mathbf{e}_r$ its direction. We will think of our plane as lying in three-dimensional oriented euclidean space.

Definition. The angular momentum of a material point of unit mass relative to the point 0 is the vector product

$$\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}].$$

The vector $\mathbf{M}$ is perpendicular to our plane and is given by one number: $\mathbf{M} = M\mathbf{n}$, where $\mathbf{n} = [\mathbf{e}_1, \mathbf{e}_2]$ is the normal vector, $\mathbf{e}_1$ and $\mathbf{e}_2$ being an oriented frame in the plane (Figure 28).

Figure 28 Angular momentum


Remark. In general, the moment of a vector $\mathbf{a}$ “applied at the point $\mathbf{r}$” relative to the point 0 is $[\mathbf{r}, \mathbf{a}]$; for example, in a school statics course one studies the moment of force. [The literal translation of the Russian term for angular momentum is “kinetic moment.” (Trans. note)]

A  The  law  of  conservation  of  angular  momentum 

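Lemma. Let $\mathbf{a}$ and $\mathbf{b}$ be two vectors changing with time in the oriented euclidean space $\mathbb{R}^3$. Then

$$\frac{d}{dt}[\mathbf{a}, \mathbf{b}] = [\dot{\mathbf{a}}, \mathbf{b}] + [\mathbf{a}, \dot{\mathbf{b}}].$$

Proof. This follows from the definition of derivative. □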

Theorem (The law of conservation of angular momentum). Under motions in a central field, the angular momentum $\mathbf{M}$ relative to the center of the field 0 does not change with time.

Proof. By definition $\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}]$. By the lemma, $\dot{\mathbf{M}} = [\dot{\mathbf{r}}, \dot{\mathbf{r}}] + [\mathbf{r}, \ddot{\mathbf{r}}]$. Since the field is central it is apparent from the equations of motion that the vectors $\ddot{\mathbf{r}}$ and $\mathbf{r}$ are collinear. Therefore $\dot{\mathbf{M}} = 0$. □
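
The conservation law can be observed on a numerically integrated trajectory. The sketch below is an illustration under stated assumptions, not the author's method: it uses the velocity Verlet scheme (chosen here because each of its sub-steps leaves $x\dot y - y\dot x$ unchanged for a central force), the newtonian law $\Phi(r) = -1/r^2$, and arbitrary initial data.

```python
import math

def central_acceleration(x, y, phi):
    """Acceleration Phi(r) e_r at the point (x, y)."""
    r = math.hypot(x, y)
    scale = phi(r) / r
    return scale * x, scale * y

def angular_momentum_history(x, y, vx, vy, phi, dt=1e-3, steps=5000):
    """Integrate r'' = Phi(r) e_r by velocity Verlet and record
    M = x*vy - y*vx after every step."""
    history = [x * vy - y * vx]
    ax, ay = central_acceleration(x, y, phi)
    for _ in range(steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = central_acceleration(x, y, phi)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        history.append(x * vy - y * vx)
    return history

def newtonian(r):
    """Phi(r) = -1/r^2, the newtonian field with k = 1 (an assumed example)."""
    return -1.0 / (r * r)

history = angular_momentum_history(1.0, 0.0, 0.0, 1.2, newtonian)
drift = max(abs(m - history[0]) for m in history)
```

Here `drift` stays at the level of rounding error, although the position and velocity themselves change substantially along the orbit.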

B  Kepler's  law 

The  law  of  conservation  of  angular  momentum  was  first  discovered  by 
Kepler  through  observation  of  the  motion  of  Mars.  Kepler  formulated  this 
law  in  a  slightly  different  way. 

We introduce polar coordinates r, $\varphi$ on our plane with pole at the center of the field 0. We consider, at the point $\mathbf{r}$ with coordinates ($|\mathbf{r}| = r$, $\varphi$), two unit vectors: $\mathbf{e}_r$, directed along the radius vector so that

$$\mathbf{r} = r\,\mathbf{e}_r,$$

and $\mathbf{e}_\varphi$, perpendicular to it in the direction of increasing $\varphi$. We express the velocity vector $\dot{\mathbf{r}}$ in terms of the basis $\mathbf{e}_r$, $\mathbf{e}_\varphi$ (Figure 29).

Lemma. We have the relation

$$\dot{\mathbf{r}} = \dot r\,\mathbf{e}_r + r\dot\varphi\,\mathbf{e}_\varphi.$$


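Proof. Clearly, the vectors $\mathbf{e}_r$ and $\mathbf{e}_\varphi$ rotate with angular velocity $\dot\varphi$, i.e.,

$$\dot{\mathbf{e}}_r = \dot\varphi\,\mathbf{e}_\varphi, \qquad \dot{\mathbf{e}}_\varphi = -\dot\varphi\,\mathbf{e}_r.$$

Differentiating the equality $\mathbf{r} = r\,\mathbf{e}_r$ gives us

$$\dot{\mathbf{r}} = \dot r\,\mathbf{e}_r + r\,\dot{\mathbf{e}}_r = \dot r\,\mathbf{e}_r + r\dot\varphi\,\mathbf{e}_\varphi. \quad □$$

Consequently, the angular momentum is

$$\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}] = [\mathbf{r}, \dot r\,\mathbf{e}_r] + [\mathbf{r}, r\dot\varphi\,\mathbf{e}_\varphi] = r\dot\varphi\,[\mathbf{r}, \mathbf{e}_\varphi] = r^2\dot\varphi\,[\mathbf{e}_r, \mathbf{e}_\varphi].$$

Thus, the quantity $M = r^2\dot\varphi$ is preserved. This quantity has a simple geometric meaning.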


Figure 30 Sectorial velocity

Kepler called the rate of change of the area S(t) swept out by the radius vector the sectorial velocity C (Figure 30):

$$C = \frac{dS}{dt}.$$

The  law  discovered  by  Kepler  through  observation  of  the  motion  of  the 
planets  says:  in  equal  times  the  radius  vector  sweeps  out  equal  areas,  so 
that  the  sectorial  velocity  is  constant,  dS/dt  =  const.  This  is  one  formulation 
of  the  law  of  conservation  of  angular  momentum.  Since 

$$\Delta S = S(t + \Delta t) - S(t) = \tfrac{1}{2} r^2\dot\varphi\,\Delta t + o(\Delta t),$$

this means that the sectorial velocity

$$C = \frac{dS}{dt} = \tfrac{1}{2} r^2\dot\varphi = \tfrac{1}{2} M$$

is half the angular momentum of our point of mass 1, and therefore constant.

Example. Some satellites have very elongated orbits. By Kepler’s law such a satellite spends most of its time in the distant part of its orbit, where the magnitude of $\dot\varphi$ is small.


8  Investigation  of  motion  in  a  central  field 

The  law  of  conservation  of  angular  momentum  lets  us  reduce  problems  about  motion  in  a 
central  field  to  problems  with  one  degree  of  freedom.  Thanks  to  this,  motion  in  a  central  field  can 
be  completely  determined. 


A  Reduction  to  a  one-dimensional  problem 

We look at the motion of a point (of mass 1) in a central field on the plane:

$$\ddot{\mathbf{r}} = -\frac{\partial U}{\partial \mathbf{r}}, \qquad U = U(r).$$

It is natural to use polar coordinates r, $\varphi$.

By the law of conservation of angular momentum the quantity $M = \dot\varphi(t)\,r^2(t)$ is constant (independent of t).


Theorem. For the motion of a material point of unit mass in a central field the distance from the center of the field varies in the same way as r varies in the one-dimensional problem with potential energy

$$V(r) = U(r) + \frac{M^2}{2r^2}.$$


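Proof. Differentiating the relation shown in Section 7 ($\dot{\mathbf{r}} = \dot r\,\mathbf{e}_r + r\dot\varphi\,\mathbf{e}_\varphi$), we find

$$\ddot{\mathbf{r}} = (\ddot r - r\dot\varphi^2)\,\mathbf{e}_r + (2\dot r\dot\varphi + r\ddot\varphi)\,\mathbf{e}_\varphi.$$

Since the field is central,

$$\frac{\partial U}{\partial \mathbf{r}} = \frac{\partial U}{\partial r}\,\mathbf{e}_r.$$

Therefore the equation of motion in polar coordinates takes the form

$$\ddot r - r\dot\varphi^2 = -\frac{\partial U}{\partial r}, \qquad 2\dot r\dot\varphi + r\ddot\varphi = 0.$$

But, by the law of conservation of angular momentum,

$$\dot\varphi = \frac{M}{r^2},$$

where M is a constant independent of t, determined by the initial conditions. Therefore,

$$\ddot r = -\frac{\partial U}{\partial r} + r\,\frac{M^2}{r^4} \quad\text{or}\quad \ddot r = -\frac{\partial V}{\partial r}, \quad\text{where } V = U + \frac{M^2}{2r^2}.$$

The quantity V(r) is called the effective potential energy. □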
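
The reduction can be checked numerically by integrating the planar problem and the one-dimensional problem side by side. This is a rough sketch under assumptions of our own choosing (the potential $U = -1/r$, a start at distance 1 with purely tangential speed 1.2, the velocity Verlet scheme); none of these choices comes from the text.

```python
import math

def planar_radius(r0, v0, dU, dt=1e-3, steps=5000):
    """Integrate the planar equation r'' = -U'(r) e_r (velocity Verlet),
    starting at distance r0 with purely tangential speed v0; return |r|."""
    x, y, vx, vy = r0, 0.0, 0.0, v0

    def acceleration(x, y):
        r = math.hypot(x, y)
        scale = -dU(r) / r
        return scale * x, scale * y

    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return math.hypot(x, y)

def reduced_radius(r0, v0, dU, dt=1e-3, steps=5000):
    """Integrate the one-dimensional equation r'' = -V'(r) with
    V = U + M^2/(2 r^2) and M = r0*v0; return r."""
    M = r0 * v0
    r, r_dot = r0, 0.0

    def acceleration(r):
        return -dU(r) + M * M / r**3

    a = acceleration(r)
    for _ in range(steps):
        r_dot += 0.5 * dt * a
        r += dt * r_dot
        a = acceleration(r)
        r_dot += 0.5 * dt * a
    return r

def kepler_dU(r):
    """U'(r) for the assumed potential U = -1/r."""
    return 1.0 / (r * r)

r_planar = planar_radius(1.0, 1.2, kepler_dU)
r_reduced = reduced_radius(1.0, 1.2, kepler_dU)
```

The two radii agree to within the discretization error of the scheme, as the theorem predicts.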


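Remark. The total energy in the derived one-dimensional problem

$$E_1 = \frac{\dot r^2}{2} + V(r)$$

is the same as the total energy in the original problem

$$E = \frac{\dot{\mathbf{r}}^2}{2} + U(r),$$

since

$$\frac{\dot{\mathbf{r}}^2}{2} = \frac{\dot r^2}{2} + \frac{r^2\dot\varphi^2}{2} = \frac{\dot r^2}{2} + \frac{M^2}{2r^2}.$$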


B  Integration  of  the  equation  of  motion 

The total energy in the derived one-dimensional problem is conserved. Consequently, the dependence of r on t is defined by the quadrature

$$\dot r = \sqrt{2(E - V(r))}, \qquad \int dt = \int \frac{dr}{\sqrt{2(E - V(r))}}.$$

Since $\dot\varphi = M/r^2$, $d\varphi/dr = (M/r^2)/\sqrt{2(E - V(r))}$, and the equation of the orbit in polar coordinates is found by quadrature,

$$\varphi = \int \frac{M/r^2\,dr}{\sqrt{2(E - V(r))}}.$$

C  Investigation  of  the  orbit 

We  fix  the  value  of  the  angular  momentum  at  M.  The  variation  of  r  with  time 
is  easy  to  visualize,  if  one  draws  the  graph  of  the  effective  potential  energy 
V(r)  (Figure  31). 

Let E be the value of the total energy. All orbits corresponding to the given E and M lie in the region $V(r) \le E$. On the boundary of this region, V = E, i.e., $\dot r = 0$. Therefore, the velocity of the moving point, in general, is not equal to zero since $\dot\varphi \ne 0$ for $M \ne 0$.

Figure 31 Graph of the effective potential energy

The inequality $V(r) \le E$ gives one or several annular regions in the plane:

$$0 \le r_{\min} \le r \le r_{\max} \le \infty.$$

If $0 \le r_{\min} < r_{\max} < \infty$, then the motion is bounded and takes place inside the ring between the circles of radius $r_{\min}$ and $r_{\max}$.


Figure  32  Orbit  of  a  point  in  a  central  field 

The shape of an orbit is shown in Figure 32. The angle $\varphi$ varies monotonically while r oscillates periodically between $r_{\min}$ and $r_{\max}$. The points where $r = r_{\min}$ are called pericentral, and where $r = r_{\max}$, apocentral (if the center is the earth: perigee and apogee; if it is the sun: perihelion and aphelion; if it is the moon: perilune and apolune).

Each  of  the  rays  leading  from  the  center  to  the  apocenter  or  to  the  peri- 
center  is  an  axis  of  symmetry  of  the  orbit. 

In general, the orbit is not closed: the angle between the successive pericenters and apocenters is given by the integral

$$\Phi = \int_{r_{\min}}^{r_{\max}} \frac{M/r^2\,dr}{\sqrt{2(E - V(r))}}.$$

The angle between two successive pericenters is twice as big.


Figure  33  Orbit  dense  in  an  annulus 

The orbit is closed if the angle $\Phi$ is commensurable with $2\pi$, i.e., if $\Phi = 2\pi(m/n)$, where m and n are integers.

It can be shown that if the angle $\Phi$ is not commensurable with $2\pi$, then the orbit is everywhere dense in the annulus (Figure 33).
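
The angle $\Phi$ can be evaluated numerically for a given potential. The sketch below is one possible approach, not taken from the text: the substitution $r = r_{\rm mid} + h\cos\theta$ is an assumption chosen to remove the inverse-square-root singularities at the turning points, and the two test potentials ($U = -1/r$ and $U = r^2/2$) with $M = 1.2$, $r_{\min} = 1$ are illustrative.

```python
import math

def apsidal_angle(U, M, r_min, r_max, samples=2000):
    """Angle Phi between a pericenter and the next apocenter.

    Uses r = r_mid + h*cos(theta), which cancels the singular factor
    at r_min and r_max, followed by the midpoint rule in theta."""
    def V(r):
        return U(r) + M * M / (2.0 * r * r)

    E = V(r_min)                       # total energy of this orbit
    r_mid = 0.5 * (r_min + r_max)
    h = 0.5 * (r_max - r_min)
    dtheta = math.pi / samples
    total = 0.0
    for i in range(samples):
        theta = (i + 0.5) * dtheta
        r = r_mid + h * math.cos(theta)
        dr = h * math.sin(theta) * dtheta
        total += (M / (r * r)) * dr / math.sqrt(2.0 * (E - V(r)))
    return total

M = 1.2
# U = -1/r: with r_min = 1 the energy is E = -0.28 and r_max = M^2/(2|E| r_min).
kepler_angle = apsidal_angle(lambda r: -1.0 / r, M, 1.0, M * M / 0.56)
# U = r^2/2: V(r) = V(M/r), so the turning points are r_min = 1 and r_max = M.
oscillator_angle = apsidal_angle(lambda r: 0.5 * r * r, M, 1.0, M)
```

For these two potentials the computed angles come out as $\pi$ and $\pi/2$ respectively, both commensurable with $2\pi$, so the corresponding orbits close.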

If $r_{\min} = r_{\max}$, i.e., E is the value of V at a minimum point, then the annulus degenerates to a circle, which is also the orbit.

Problem. For which values of $\alpha$ is motion along a circular orbit in the field with potential energy $U = r^\alpha$, $-2 < \alpha < \infty$, Liapunov stable?

Answer. Only for $\alpha = 2$.

For values of E a little larger than the minimum of V the annulus $r_{\min} \le r \le r_{\max}$ will be very narrow, and the orbit will be close to a circle. In the corresponding one-dimensional problem, r will perform small oscillations close to the minimum point of V.

Problem. Find the angle $\Phi$ for an orbit close to the circle of radius r.

Hint.  Cf.  Section  D  below. 

We now look at the case $r_{\max} = \infty$. If $\lim_{r\to\infty} U(r) = \lim_{r\to\infty} V(r) = U_\infty < \infty$, then it is possible for orbits to go off to infinity. If the initial energy E is larger than $U_\infty$, then the point goes to infinity with finite velocity $\dot r_\infty = \sqrt{2(E - U_\infty)}$. We notice that if U(r) approaches its limit slower than $r^{-2}$, then the effective potential V will be attracting at infinity (here we assume that the potential U is attracting at infinity).

If, as $r \to 0$, $|U(r)|$ does not grow faster than $M^2/2r^2$, then $r_{\min} > 0$ and the orbit never approaches the center. If, however, $U(r) + (M^2/2r^2) \to -\infty$ as $r \to 0$, then it is possible to “fall into the center of the field.” Falling into the center of the field is possible even in finite time (for example, in the field $U(r) = -1/r^3$).

Problem. Examine the shape of an orbit in the case when the total energy is equal to the value of the effective energy V at a local maximum point.


D  Central  fields  in  which  all  bounded  orbits  are 
closed 

It  follows  from  the  following  sequence  of  problems  that  there  are  only  two 
cases  in  which  all  the  bounded  orbits  in  a  central  field  are  closed,  namely, 

$$U = ar^2, \quad a > 0,$$

and

$$U = -\frac{k}{r}, \quad k > 0.$$


Problem 1. Show that the angle $\Phi$ between the pericenter and apocenter is equal to the semiperiod of an oscillation in the one-dimensional system with potential energy $W(x) = U(M/x) + (x^2/2)$.

Hint. The substitution $x = M/r$ gives

$$\Phi = \int_{x_{\min}}^{x_{\max}} \frac{dx}{\sqrt{2(E - W)}}.$$

Problem 2. Find the angle $\Phi$ for an orbit close to the circle of radius r.

Answer. $\Phi \approx \Phi_{\rm cir} = \pi\,M/(r^2\sqrt{V''(r)}) = \pi\sqrt{U'/(3U' + rU'')}$.

Problem 3. For which values of U is the magnitude of $\Phi_{\rm cir}$ independent of the radius r?

Answer. $U(r) = ar^\alpha$ ($\alpha > -2$, $\alpha \ne 0$) and $U(r) = b\log r$.

It follows that $\Phi_{\rm cir} = \pi/\sqrt{\alpha + 2}$ (the logarithmic case corresponds to $\alpha = 0$). For example, for $\alpha = 2$ we have $\Phi_{\rm cir} = \pi/2$, and for $\alpha = -1$ we have $\Phi_{\rm cir} = \pi$.
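
The independence of $\Phi_{\rm cir}$ from the radius for power-law and logarithmic potentials can be spot-checked from the formula $\Phi_{\rm cir} = \pi\sqrt{U'/(3U' + rU'')}$. The following sketch estimates the derivatives by central finite differences; the step size, the sample radii, and the helper names are assumptions made for illustration.

```python
import math

def phi_cir(U, r, h=1e-4):
    """Phi_cir = pi * sqrt(U' / (3U' + r U'')) with U' and U'' estimated
    by central finite differences of step h."""
    dU = (U(r + h) - U(r - h)) / (2.0 * h)
    d2U = (U(r + h) - 2.0 * U(r) + U(r - h)) / (h * h)
    return math.pi * math.sqrt(dU / (3.0 * dU + r * d2U))

def power_law(alpha, a=1.0):
    """U(r) = a r^alpha; a should have the sign of alpha so that U' > 0."""
    return lambda r: a * r**alpha

radii = (0.5, 1.0, 3.0)
harmonic_values = [phi_cir(power_law(2.0), r) for r in radii]
newtonian_values = [phi_cir(power_law(-1.0, -1.0), r) for r in radii]
logarithmic_values = [phi_cir(math.log, r) for r in radii]
```

At every sampled radius the three lists stay close to $\pi/2$, $\pi$ and $\pi/\sqrt{2}$, matching $\pi/\sqrt{\alpha + 2}$ for $\alpha = 2, -1, 0$.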

Problem 4. Let in the situation of Problem 3 $U(r) \to \infty$ as $r \to \infty$. Find $\lim_{E\to\infty} \Phi(E, M)$.

Answer. $\pi/2$.


Hint. The substitution $x = y\,x_{\max}$ reduces $\Phi$ to the form

$$\Phi = \int_{y_{\min}}^{1} \frac{dy}{\sqrt{2(W^*(1) - W^*(y))}}, \qquad W^*(y) = \frac{y^2}{2} + \frac{1}{x_{\max}^2}\,U\!\left(\frac{M}{y\,x_{\max}}\right).$$

As $E \to \infty$ we have $x_{\max} \to \infty$ and $y_{\min} \to 0$, and the second term in $W^*$ can be discarded.


Problem 5. Let $U(r) = -kr^{-\beta}$, $0 < \beta < 2$. Find $\Phi_0 = \lim_{E\to -0} \Phi$.

Answer. $\Phi_0 = \int_0^1 dx/\sqrt{x^\beta - x^2} = \pi/(2 - \beta)$. Note that $\Phi_0$ does not depend on M.

Problem 6. Find all central fields in which bounded orbits exist and are all closed.

Answer. $U = ar^2$ or $U = -k/r$.

Solution. If all bounded orbits are closed, then, in particular, $\Phi_{\rm cir} = 2\pi(m/n) = \text{const}$. According to Problem 3, $U = ar^\alpha$ ($\alpha > -2$), or $U = b\ln r$ ($\alpha = 0$). In both cases $\Phi_{\rm cir} = \pi/\sqrt{\alpha + 2}$. If $\alpha > 0$, then according to Problem 4, $\lim_{E\to\infty} \Phi(E, M) = \pi/2$. Therefore, $\Phi_{\rm cir} = \pi/2$, $\alpha = 2$. If $\alpha < 0$, then according to Problem 5, $\lim_{E\to -0} \Phi(E, M) = \pi/(2 + \alpha)$. Therefore, $\pi/(2 + \alpha) = \pi/\sqrt{2 + \alpha}$, $\alpha = -1$. In the case $\alpha = 0$ we find $\Phi_{\rm cir} = \pi/\sqrt{2}$, which is not commensurable with $2\pi$. Therefore, all bounded orbits can be closed only in fields where $U = ar^2$ or $U = -k/r$. In the field $U = ar^2$, $a > 0$, all the orbits are closed (these are ellipses with center at 0, cf. Example 1, Section 5). In the field $U = -k/r$ all bounded orbits are also closed and also elliptical, as we will now show.

E  Kepler's  problem 

This problem concerns motion in a central field with potential $U = -k/r$ and therefore $V(r) = -(k/r) + (M^2/2r^2)$ (Figure 34).

By the general formula

$$\varphi = \int \frac{M/r^2\,dr}{\sqrt{2(E - V(r))}}.$$

Figure 34 Effective potential of the Kepler problem

Integrating, we get

$$\varphi = \arccos \frac{\dfrac{M}{r} - \dfrac{k}{M}}{\sqrt{2E + \dfrac{k^2}{M^2}}}.$$

To this expression we should have added an arbitrary constant. We will assume it equal to zero; this is equivalent to the choice of an origin of reference for the angle $\varphi$ at the pericenter. We introduce the following notation:

$$\frac{M^2}{k} = p, \qquad \sqrt{1 + \frac{2EM^2}{k^2}} = e.$$

Now we get $\varphi = \arccos\bigl(((p/r) - 1)/e\bigr)$, i.e.,

$$r = \frac{p}{1 + e\cos\varphi}.$$

This  is  the  so-called  focal  equation  of  a  conic  section.  The  motion  is  bounded 
(Figure  35)  for  E  <  0.  Then  e  <  1,  i.e.,  the  conic  section  is  an  ellipse.  The 
number  p  is  called  the  parameter  of  the  ellipse,  and  e  the  eccentricity.  Kepler’s 
first  law,  which  he  discovered  by  observing  the  motion  of  Mars,  consists 
of  the  fact  that  the  planets  describe  ellipses,  with  the  sun  at  one  focus. 
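
As a small consistency check of the focal equation, the sketch below computes p and e from E, M, k and confirms that the pericenter and apocenter radii are turning points of the effective potential, that is, $V(r) = E$ there. The numerical values ($k = 1$, $M = 1.2$, $E = -0.28$) and the function names are assumptions chosen for illustration.

```python
import math

def conic_elements(E, M, k):
    """Parameter p = M^2/k and eccentricity e = sqrt(1 + 2 E M^2 / k^2)."""
    p = M * M / k
    e = math.sqrt(1.0 + 2.0 * E * M * M / (k * k))
    return p, e

def orbit_radius(phi, p, e):
    """Focal equation r = p / (1 + e cos(phi))."""
    return p / (1.0 + e * math.cos(phi))

def effective_potential(r, M, k):
    """V(r) = -k/r + M^2 / (2 r^2)."""
    return -k / r + M * M / (2.0 * r * r)

E, M, k = -0.28, 1.2, 1.0
p, e = conic_elements(E, M, k)
r_pericenter = orbit_radius(0.0, p, e)
r_apocenter = orbit_radius(math.pi, p, e)
```

With these values p = 1.44 and e = 0.44, so the orbit is an ellipse, and the effective potential equals E at both `r_pericenter` and `r_apocenter`.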


If we assume that the planets move in a central field of gravity, then Kepler’s first law implies Newton’s law of gravity: $U = -(k/r)$ (cf. Section 2D above).

The parameter and eccentricity are related to the semi-axes by the formulas

$$2a = \frac{p}{1 - e} + \frac{p}{1 + e} = \frac{2p}{1 - e^2},$$

i.e.,

$$a = \frac{p}{1 - e^2},$$

$e = c/a = \sqrt{a^2 - b^2}/a$, where $c = ae$ is the distance from the center to the focus (cf. Figure 35).


Remark. An ellipse with small eccentricity is very close to a circle.¹⁷ If the distance from the focus to the center is small of first order, then the difference between the semi-axes is of second order: $b = a\sqrt{1 - e^2} \approx a(1 - \tfrac{1}{2}e^2)$. For example, in the ellipse with major semi-axis of 10 cm and eccentricity 0.1, the difference of the semi-axes is 0.5 mm, and the distance between the focus and the center is 1 cm.

The  eccentricities  of  planets’  orbits  are  very  small.  Therefore,  Kepler 
originally  formulated  his  first  law  as  follows:  the  planets  move  around  the 
sun  in  circles,  but  the  sun  is  not  at  the  center. 

Kepler’s  second  law,  that  the  sectorial  velocity  is  constant,  is  true  in  any 
central  field. 

Kepler’s third law says that the period of revolution around an elliptical orbit depends only on the size of the major semi-axis.

The squares of the revolution periods of two planets on different elliptical orbits have the same ratio as the cubes of their major semi-axes.¹⁸

Proof. We denote by T the period of revolution and by S the area swept out by the radius vector in time T. $2S = MT$, since $M/2$ is the sectorial velocity. But the area of the ellipse, S, is equal to $\pi ab$, so $T = 2\pi ab/M$. Since

$$a = \frac{M^2/k}{2|E|\,M^2/k^2} = \frac{k}{2|E|}$$

(from $a = p/(1 - e^2)$), and

$$b = \frac{M^2}{k} \times \frac{1}{\sqrt{2|E|\,M^2/k^2}} = \frac{M}{\sqrt{2|E|}},$$

then $T = 2\pi k/(\sqrt{2|E|})^3$; but $2|E| = k/a$, so $T = 2\pi a^{3/2} k^{-1/2}$. □
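
The formula for the period can be compared with a direct integration of the area law $dt = r^2\,d\varphi/M$ around the ellipse $r = p/(1 + e\cos\varphi)$. This is a sketch with assumed sample values of E, M, k; it is meant as an illustration rather than a derivation.

```python
import math

def period_from_area_law(E, M, k, samples=2000):
    """T = integral over one revolution of r^2 dphi / M, with
    r = p / (1 + e cos(phi)); intended for a bounded orbit (E < 0)."""
    p = M * M / k
    e = math.sqrt(1.0 + 2.0 * E * M * M / (k * k))
    dphi = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples):
        phi = (i + 0.5) * dphi
        r = p / (1.0 + e * math.cos(phi))
        total += r * r * dphi / M
    return total

def period_from_third_law(E, k):
    """T = 2 pi a^(3/2) k^(-1/2) with a = k / (2|E|)."""
    a = k / (2.0 * abs(E))
    return 2.0 * math.pi * a**1.5 / math.sqrt(k)

# Two orbits with the same energy but different angular momentum.
t_wide = period_from_area_law(-0.28, 1.2, 1.0)
t_narrow = period_from_area_law(-0.28, 0.8, 1.0)
t_formula = period_from_third_law(-0.28, 1.0)
```

Both integrated periods agree with the formula, which illustrates that T depends on the energy (equivalently, on a) and not on M.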


We  note  that  the  total  energy  E  depends  only  on  the  major  semi-axis  a 
of  the  orbit  and  is  the  same  for  the  whole  set  of  elliptical  orbits,  from  a  circle 
of  radius  a  to  a  line  segment  of  length  2a. 


Problem.  At  the  entry  of  a  satellite  into  a  circular  orbit  at  a  distance  300  km 
from  the  earth  the  direction  of  its  velocity  deviates  from  the  intended  direction 
by  1°  towards  the  earth.  How  is  the  perigee  changed? 

Answer.  The  height  of  the  perigee  is  less  by  approximately  110  km. 

¹⁷ Let a drop of tea fall into a glass of tea close to the center. The waves collect at the symmetric point. The reason is that, by the focal definition of an ellipse, waves radiating from one focus of the ellipse collect at the other.

¹⁸ By planets we mean here points in a central field.


Figure  36  An  orbit  which  is  close  to  circular 

Hint. The orbit differs from a circle only to second order, and we can disregard this difference. The radius has the intended value since the initial energy has the intended value. Therefore, we get the true orbit (Figure 36) by twisting the intended orbit through 1°.

Problem.  How  does  the  height  of  the  perigee  change  if  the  actual  velocity 
is  1  m/sec  less  than  intended? 

Problem. The first cosmic velocity is the velocity of motion on a circular orbit of radius close to the radius of the earth. Find the magnitude of the first cosmic velocity $v_1$ and show that $v_2 = \sqrt{2}\,v_1$ (cf. Section 3B).

Answer.  8.1  km/sec. 

Problem.¹⁹ During his walk in outer space, the cosmonaut A. Leonov threw the lens cap of his movie camera towards the earth. Describe the motion of the lens cap with respect to the spaceship, taking the velocity of the throw as 10 m/sec.

Answer. The lens cap will move relative to the cosmonaut approximately in an ellipse with major axis about 32 km and minor axis about 16 km. The center of the ellipse will be situated 16 km in front of the cosmonaut in his orbit, and the period of circulation around the ellipse will be equal to the period of motion around the orbit.

Hint. We take as our unit of length the radius of the space ship’s circular orbit, and we choose a unit of time so that the period of revolution around this orbit is $2\pi$. We must study solutions to Newton’s equation

$$\ddot{\mathbf{r}} = -\frac{\mathbf{r}}{r^3},$$

close to the circular solution with $r_0 = 1$, $\varphi_0 = t$. We seek those solutions in the form

$$r = r_0 + r_1, \qquad \varphi = \varphi_0 + \varphi_1, \qquad r_1 \ll 1, \quad \varphi_1 \ll 1.$$

¹⁹ This problem is taken from V. V. Beletskii’s delightful book “Notes on the Motion of Celestial Bodies,” Nauka, 1972.


By the theorem on the differentiability of a solution with respect to its initial conditions, the functions $r_1(t)$ and $\varphi_1(t)$ satisfy a system of linear differential equations (equations of variation) up to small amounts which are of higher than first order in the initial deviation.

By substituting the expressions for r and $\varphi$ in Newton’s equation, we get, after simple computation, the variational equations in the form

$$\ddot r_1 = 3r_1 + 2\dot\varphi_1, \qquad \ddot\varphi_1 = -2\dot r_1.$$

After solving these equations for the given initial conditions ($r_1(0) = \varphi_1(0) = \dot\varphi_1(0) = 0$, $\dot r_1(0) = -(1/800)$), we get the answer given above.

Disregarding the small quantities of second order gives an effect of under 1/800 of the one obtained (i.e., on the order of 10 meters on one loop). Thus the lens cap describes a 30 km ellipse in an hour-and-a-half, returns to the space ship on the side opposite the earth, and goes past at the distance of a few tens of meters.

Of  course,  in  this  calculation  we  have  disregarded  the  deviation  of  the  orbit 
from  a  circle,  the  effect  of  forces  other  than  gravity,  etc. 
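
The variational equations of the hint can be integrated numerically to recover the stated ellipse. The sketch below uses a classical fourth-order Runge-Kutta step (our choice of method) and converts to kilometers with an assumed orbit radius of 6400 km; that radius is an illustrative assumption, not a figure from the text.

```python
import math

def variational_rhs(state):
    """Right-hand side of r1'' = 3 r1 + 2 phi1', phi1'' = -2 r1'
    for the state (r1, phi1, r1', phi1')."""
    r1, phi1, r1_dot, phi1_dot = state
    return (r1_dot, phi1_dot, 3.0 * r1 + 2.0 * phi1_dot, -2.0 * r1_dot)

def rk4_step(state, dt):
    """One classical Runge-Kutta step."""
    def shifted(s, k, f):
        return tuple(a + f * b for a, b in zip(s, k))
    k1 = variational_rhs(state)
    k2 = variational_rhs(shifted(state, k1, 0.5 * dt))
    k3 = variational_rhs(shifted(state, k2, 0.5 * dt))
    k4 = variational_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def lens_cap_track(throw=1.0 / 800.0, steps=2000):
    """(r1, phi1) over one orbital period 2*pi, thrown towards the earth."""
    dt = 2.0 * math.pi / steps
    state = (0.0, 0.0, -throw, 0.0)
    track = [(0.0, 0.0)]
    for _ in range(steps):
        state = rk4_step(state, dt)
        track.append((state[0], state[1]))
    return track

def ellipse_axes_km(track, orbit_radius_km=6400.0):
    """Major axis, minor axis and forward offset of the center, in km.
    The orbit radius is an assumed round value."""
    radial = [orbit_radius_km * r1 for r1, _ in track]
    along = [orbit_radius_km * phi1 for _, phi1 in track]
    major = max(along) - min(along)
    minor = max(radial) - min(radial)
    center_ahead = 0.5 * (max(along) + min(along))
    return major, minor, center_ahead

track = lens_cap_track()
major_km, minor_km, center_km = ellipse_axes_km(track)
```

Under these assumptions the track closes after one period, with a major axis of about 32 km along the orbit, a minor axis of about 16 km, and a center about 16 km ahead, in agreement with the answer above.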

9  The  motion  of  a  point  in  three-space 

In  this  paragraph  we  define  the  angular  momentum  relative  to  an  axis  and  we  show  that,  for 
motion  in  an  axially  symmetric  field,  it  is  conserved. 

All  the  results  obtained  for  motion  in  a  plane  can  be  easily  carried  over  to  motions  in  space. 


A  Conservative  fields 

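We consider a motion in the conservative field

$$\ddot{\mathbf{r}} = -\frac{\partial U}{\partial \mathbf{r}},$$

where $U = U(\mathbf{r})$, $\mathbf{r} \in E^3$.

The law of conservation of energy holds:

$$\frac{dE}{dt} = 0, \quad\text{where } E = \tfrac{1}{2}\dot{\mathbf{r}}^2 + U(\mathbf{r}).$$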


B  Central  fields 

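For motion in a central field the vector $\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}]$ does not change: $d\mathbf{M}/dt = 0$.

Every central field is conservative (this is proved as in the two-dimensional case), and

$$\frac{d\mathbf{M}}{dt} = [\dot{\mathbf{r}}, \dot{\mathbf{r}}] + [\mathbf{r}, \ddot{\mathbf{r}}] = 0,$$

since $\ddot{\mathbf{r}} = -(\partial U/\partial \mathbf{r})$, and the vector $\partial U/\partial \mathbf{r}$ is collinear with $\mathbf{r}$ since the field is central.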




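Corollary. For motion in a central field, every orbit is planar.

Proof. $(\mathbf{M}, \mathbf{r}) = ([\mathbf{r}, \dot{\mathbf{r}}], \mathbf{r}) = 0$; therefore $\mathbf{r}(t) \perp \mathbf{M}$, and since $\mathbf{M} = \text{const}$, all orbits lie in the plane perpendicular to $\mathbf{M}$.²⁰ □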

Thus  the  study  of  orbits  in  a  central  field  in  space  reduces  to  the  planar 
problem  examined  in  the  previous  paragraph. 

Problem.  Investigate  motion  in  a  central  field  in  n-dimensional  euclidean 
space. 

C  Axially  symmetric  fields 

Definition. A vector field in $E^3$ has axial symmetry if it is invariant with respect to the group of rotations of space which fix every point of some axis.

Problem. Show that if a field is axially symmetric and conservative, then its potential energy has the form $U = U(r, z)$, where r, $\varphi$, and z are cylindrical coordinates.

In  particular,  it  follows  from  this  that  the  vectors  of  the  field  lie  in  planes 
through  the  z  axis. 

As  an  example  of  such  a  field  we  can  take  the  gravitational  field  created 
by  a  solid  of  revolution. 


Figure 37 Moment of the vector F with respect to an axis

Let z be the axis, oriented by the vector $\mathbf{e}_z$ in three-dimensional euclidean space $E^3$; $\mathbf{F}$ a vector in the euclidean linear space $\mathbb{R}^3$; 0 a point on the z axis; $\mathbf{r} = \mathbf{x} - 0 \in \mathbb{R}^3$ the radius vector of the point $\mathbf{x} \in E^3$ relative to 0 (Figure 37).

Definition. The moment $M_z$ relative to the z axis of the vector $\mathbf{F}$ applied at the point $\mathbf{r}$ is the projection onto the z axis of the moment of the vector $\mathbf{F}$ relative to some point on this axis:

$$M_z = (\mathbf{e}_z, [\mathbf{r}, \mathbf{F}]).$$

²⁰ The case M = 0 is left to the reader.

The number $M_z$ does not depend on the choice of the point 0 on the z axis. In fact, if we look at a point 0' on the axis, then by properties of the triple product, $M_z' = (\mathbf{e}_z, [\mathbf{r}', \mathbf{F}]) = ([\mathbf{e}_z, \mathbf{r}'], \mathbf{F}) = ([\mathbf{e}_z, \mathbf{r}], \mathbf{F}) = M_z$.

Remark. $M_z$ depends on the choice of the direction of the z axis: if we change $\mathbf{e}_z$ to $-\mathbf{e}_z$, then $M_z$ changes sign.

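Theorem. For a motion in a conservative field with axial symmetry around the z axis, the moment of velocity relative to the z axis is conserved.

Proof. $M_z = (\mathbf{e}_z, [\mathbf{r}, \dot{\mathbf{r}}])$. Since $\ddot{\mathbf{r}} = \mathbf{F}$, it follows that $\mathbf{r}$ and $\ddot{\mathbf{r}}$ lie in a plane passing through the z axis, and therefore $[\mathbf{r}, \ddot{\mathbf{r}}]$ is perpendicular to $\mathbf{e}_z$. Therefore,

$$\dot M_z = (\mathbf{e}_z, [\dot{\mathbf{r}}, \dot{\mathbf{r}}]) + (\mathbf{e}_z, [\mathbf{r}, \ddot{\mathbf{r}}]) = 0. \quad □$$

Remark. This proof works for any force field in which the force vector $\mathbf{F}$ lies in the plane spanned by $\mathbf{r}$ and $\mathbf{e}_z$.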

10  Motions  of  a  system  of  n  points 

In this paragraph we prove the laws of conservation of energy, momentum, and angular momentum for systems of material points in $E^3$.

A  Internal  and  external  forces 

Newton’s equations for the motion of a system of n material points, with masses $m_i$ and radius vectors $\mathbf{r}_i \in E^3$ are the equations

$$m_i\ddot{\mathbf{r}}_i = \mathbf{F}_i, \qquad i = 1, 2, \ldots, n.$$

The vector $\mathbf{F}_i$ is called the force acting on the i-th point.

The forces $\mathbf{F}_i$ are determined experimentally. We often observe in a system that for two points these forces are equal in magnitude and act in opposite directions along the straight line joining the points (Figure 38).


Figure  38  Forces  of  interaction 


Such forces are called forces of interaction (example: the force of universal gravitation).

If all forces acting on a point of the system are forces of interaction, then the system is said to be closed. By definition, the force acting on the i-th point of a closed system is

$$\mathbf{F}_i = \sum_{\substack{j=1 \\ j \ne i}}^{n} \mathbf{F}_{ij}.$$

The vector $\mathbf{F}_{ij}$ is the force with which the j-th point acts on the i-th.

Since the forces $\mathbf{F}_{ij}$ and $\mathbf{F}_{ji}$ are opposite ($\mathbf{F}_{ij} = -\mathbf{F}_{ji}$), we can write them in the form $\mathbf{F}_{ij} = f_{ij}\mathbf{e}_{ij}$, where $f_{ij} = f_{ji}$ is the magnitude of the force and $\mathbf{e}_{ij}$ is the unit vector in the direction from the i-th point to the j-th point.

If the system is not closed, then it is often possible to represent the forces acting on it in the form

$$\mathbf{F}_i = \sum_{j \ne i} \mathbf{F}_{ij} + \mathbf{F}_i',$$

where $\mathbf{F}_{ij}$ are forces of interaction and $\mathbf{F}_i'(\mathbf{r}_i)$ is the so-called external force.

Figure 39 Internal and external forces

Example. (Figure 39) We separate a closed system into two parts, I and II. The force $\mathbf{F}_i$ applied to the i-th point of system I is determined by forces of interaction inside system I and forces acting on the i-th point from points of system II, i.e.,

$$\mathbf{F}_i = \sum_{\substack{j \in \mathrm{I} \\ j \ne i}} \mathbf{F}_{ij} + \mathbf{F}_i'.$$

$\mathbf{F}_i'$ is the external force with respect to system I.

B The law of conservation of momentum

Definition. The momentum of a system is the vector

$$\mathbf{P} = \sum_{i=1}^{n} m_i\dot{\mathbf{r}}_i.$$

Theorem. The rate of change of momentum of a system is equal to the sum of all external forces acting on points of the system.

Proof. $d\mathbf{P}/dt = \sum_{i=1}^{n} m_i\ddot{\mathbf{r}}_i = \sum_{i=1}^{n} \mathbf{F}_i = \sum_{i,j} \mathbf{F}_{ij} + \sum_i \mathbf{F}_i' = \sum_i \mathbf{F}_i'$; $\sum_{i,j} \mathbf{F}_{ij} = 0$, since for forces of interaction $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$. □

Corollary 1. The momentum of a closed system is conserved.

Corollary 2. If the sum of the exterior forces acting on a system is perpendicular to the x axis, then the projection $P_x$ of the momentum onto the x axis is conserved: $P_x = \text{const}$.




Definition. The center of mass of a system is the point

$$\mathbf{r} = \frac{\sum m_i\mathbf{r}_i}{\sum m_i}.$$

Problem. Show that the center of mass is well defined, i.e., does not depend on the choice of the origin of reference for radius vectors.

The momentum of a system is equal to the momentum of a particle lying at the center of mass of the system and having mass $\sum m_i$.

In fact, $(\sum m_i)\mathbf{r} = \sum (m_i\mathbf{r}_i)$, from which it follows that $(\sum m_i)\dot{\mathbf{r}} = \sum m_i\dot{\mathbf{r}}_i$.

We can now formulate the theorem about momentum as a theorem about the motion of the center of mass.

Theorem. The center of mass of a system moves as if all masses were concentrated at it and all forces were applied to it.

Proof. $(\sum m_i)\dot{\mathbf{r}} = \mathbf{P}$. Therefore, $(\sum m_i)\ddot{\mathbf{r}} = d\mathbf{P}/dt = \sum_i \mathbf{F}_i$. □

Corollary. If a system is closed, then its center of mass moves uniformly and linearly.
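
Both corollaries can be watched on a small simulated closed system. The following is a sketch under assumptions of our own: three points in the plane joined pairwise by linear springs (so that $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$ along the joining line), integrated by the explicit Euler method; the masses and initial data are arbitrary illustrative values.

```python
def pair_force(ri, rj, stiffness=1.0):
    """Force F_ij exerted on point i by point j: a spring along the line
    joining the points, so that F_ij = -F_ji."""
    return (stiffness * (rj[0] - ri[0]), stiffness * (rj[1] - ri[1]))

def momentum(masses, velocities):
    """P = sum of m_i * v_i."""
    return (sum(m * v[0] for m, v in zip(masses, velocities)),
            sum(m * v[1] for m, v in zip(masses, velocities)))

def center_of_mass(masses, positions):
    """r = sum of m_i * r_i divided by the total mass."""
    total = sum(masses)
    return (sum(m * r[0] for m, r in zip(masses, positions)) / total,
            sum(m * r[1] for m, r in zip(masses, positions)) / total)

def simulate(masses, positions, velocities, dt=1e-3, steps=1000):
    """Explicit Euler integration of m_i r_i'' = sum over j of F_ij."""
    n = len(masses)
    for _ in range(steps):
        forces = []
        for i in range(n):
            fx = fy = 0.0
            for j in range(n):
                if j != i:
                    f = pair_force(positions[i], positions[j])
                    fx += f[0]
                    fy += f[1]
            forces.append((fx, fy))
        positions = [(r[0] + dt * v[0], r[1] + dt * v[1])
                     for r, v in zip(positions, velocities)]
        velocities = [(v[0] + dt * f[0] / m, v[1] + dt * f[1] / m)
                      for m, v, f in zip(masses, velocities, forces)]
    return positions, velocities

masses = [1.0, 2.0, 3.0]
positions0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
velocities0 = [(0.5, 0.0), (0.0, -0.3), (0.1, 0.2)]
positions1, velocities1 = simulate(masses, positions0, velocities0)
```

After the simulated unit of time the total momentum is unchanged up to rounding, and the center of mass has moved by $\mathbf{P}/\sum m_i$ in a straight line, even though the individual points oscillate.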

C  The  law  of  conservation  of  angular  momentum 

Definition. The angular momentum of a material point of mass m relative to the point 0, is the moment of the momentum vector relative to 0:

$$\mathbf{M} = [\mathbf{r}, m\dot{\mathbf{r}}].$$

The angular momentum of a system relative to 0 is the sum of the angular momenta of all the points in the system:

$$\mathbf{M} = \sum_{i=1}^{n} [\mathbf{r}_i, m_i\dot{\mathbf{r}}_i].$$

Theorem. The rate of change of the angular momentum of a system is equal to the sum of the moments of the external forces²¹ acting on the points of the system.


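Proof. $d\mathbf{M}/dt = \sum_{i=1}^{n} [\dot{\mathbf{r}}_i, m_i\dot{\mathbf{r}}_i] + \sum_{i=1}^{n} [\mathbf{r}_i, m_i\ddot{\mathbf{r}}_i]$. The first term is equal to zero, and the second is equal to

$$\sum_{i=1}^{n} [\mathbf{r}_i, \mathbf{F}_i] = \sum_{i=1}^{n} \Bigl[\mathbf{r}_i, \Bigl(\sum_{j \ne i} \mathbf{F}_{ij} + \mathbf{F}_i'\Bigr)\Bigr]$$

by Newton’s equations.

²¹ The moment of force is also called the torque [Trans. note].

The sum of the moments of two forces of interaction is equal to zero since $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, so $[\mathbf{r}_i, \mathbf{F}_{ij}] + [\mathbf{r}_j, \mathbf{F}_{ji}] = [(\mathbf{r}_i - \mathbf{r}_j), \mathbf{F}_{ij}] = 0$.

Therefore, the sum of the moments of all forces of interaction is equal to zero:

$$\sum_{i=1}^{n} \Bigl[\mathbf{r}_i, \sum_{j \ne i} \mathbf{F}_{ij}\Bigr] = 0.$$

Therefore, $d\mathbf{M}/dt = \sum_{i=1}^{n} [\mathbf{r}_i, \mathbf{F}_i']$. □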

Corollary  1  (The  law  of  conservation  of  angular  momentum).  If  the  system 
is  closed,  then  M  =  const. 

We denote the sum of the moments of the external forces by $\mathbf{N} = \sum_{i=1}^{n} [\mathbf{r}_i, \mathbf{F}_i']$.

Then, by the theorem above, $d\mathbf{M}/dt = \mathbf{N}$, from which we have

Corollary 2. If the moment of the external forces relative to the z axis is equal to zero, then $M_z$ is constant.
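The cancellation of the moments of the interaction forces can be checked numerically. The following is a minimal Python sketch, not part of the original text: the three positions, the inverse-square attraction, and all function names are assumptions chosen only for illustration.

```python
import math

def cross(a, b):
    """Vector product [a, b] of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def interaction_force(ri, rj):
    """Force on point i due to point j (assumed law: attraction ~ 1/d^2).

    It is directed along r_j - r_i, so F_ij = -F_ji automatically.
    """
    d = sub(rj, ri)
    dist = math.sqrt(sum(x * x for x in d))
    return tuple(x / dist**3 for x in d)

def total_internal_moment(points):
    """Sum over i of [r_i, sum_{j != i} F_ij]."""
    total = (0.0, 0.0, 0.0)
    for i, ri in enumerate(points):
        for j, rj in enumerate(points):
            if i != j:
                total = add(total, cross(ri, interaction_force(ri, rj)))
    return total

# Hypothetical configuration of three points.
points = [(1.0, 0.0, 0.5), (-0.3, 2.0, 0.0), (0.7, -1.1, 1.9)]
moment = total_internal_moment(points)
print(moment)  # each component vanishes up to rounding error
```

Only the external forces can therefore change M, which is what the theorem asserts.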

D The law of conservation of energy

Definition. The kinetic energy of a point of mass m is

$$T = \frac{m\dot{\mathbf{r}}^2}{2}.$$

Definition. The kinetic energy of a system of mass points is the sum of the kinetic energies of the points:

$$T = \sum_{i=1}^{n} \frac{m_i\dot{\mathbf{r}}_i^2}{2},$$

where the $m_i$ are the masses of the points and $\dot{\mathbf{r}}_i$ are their velocities.


Theorem.  The  increase  in  the  kinetic  energy  of  a  system  is  equal  to  the  sum  of 
the  work  of  all  forces  acting  on  the  points  of  the  system. 

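Proof.

$$\frac{dT}{dt} = \sum_{i=1}^{n} m_i(\dot{\mathbf{r}}_i, \ddot{\mathbf{r}}_i) = \sum_{i=1}^{n} (\dot{\mathbf{r}}_i, m_i\ddot{\mathbf{r}}_i) = \sum_{i=1}^{n} (\dot{\mathbf{r}}_i, \mathbf{F}_i).$$

Therefore,

$$T(t) - T(t_0) = \int_{t_0}^{t} \frac{dT}{dt}\,dt = \sum_{i=1}^{n} \int_{t_0}^{t} (\dot{\mathbf{r}}_i, \mathbf{F}_i)\,dt = \sum_{i=1}^{n} A_i.$$ □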


The configuration space of a system of n mass points in $E^3$ is the direct product of n euclidean spaces: $E^{3n} = E^3 \times \cdots \times E^3$. It has itself the structure of a euclidean space.

Let $\mathbf{r} = (\mathbf{r}_1, \ldots, \mathbf{r}_n)$ be the radius vector of a point in the configuration space, and $\mathbf{F} = (\mathbf{F}_1, \ldots, \mathbf{F}_n)$ the force vector. We can write the theorem above in the form

$$T(t_1) - T(t_0) = \int_{\mathbf{r}(t_0)}^{\mathbf{r}(t_1)} (\mathbf{F}, d\mathbf{r}) = \int_{t_0}^{t_1} (\dot{\mathbf{r}}, \mathbf{F})\,dt.$$


In  other  words: 

The increase in kinetic energy is equal to the work of the “force” $\mathbf{F}$ on the “path” $\mathbf{r}(t)$ in configuration space.

Definition. A system is called conservative if the forces depend only on the location of a point in the system ($\mathbf{F} = \mathbf{F}(\mathbf{r})$), and if the work of $\mathbf{F}$ along any path depends only on the initial and final points of the path:

$$\int_{M_1}^{M_2} (\mathbf{F}, d\mathbf{r}) = \Phi(M_1, M_2).$$


Theorem. For a system to be conservative it is necessary and sufficient that there exist a potential energy, i.e., a function $U(\mathbf{r})$ such that

$$\mathbf{F} = -\frac{\partial U}{\partial \mathbf{r}}.$$

Proof. Cf. Section 6B. □

Theorem. The total energy of a conservative system ($E = T + U$) is preserved under the motion: $E(t_1) = E(t_0)$.

Proof. By what was shown earlier,

$$T(t_1) - T(t_0) = \int_{\mathbf{r}(t_0)}^{\mathbf{r}(t_1)} (\mathbf{F}, d\mathbf{r}) = U(\mathbf{r}(t_0)) - U(\mathbf{r}(t_1)).$$ □

Let  all  the  forces  acting  on  the  points  of  a  system  be  divided  into  forces  of 
interaction  and  external  forces: 

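$$\mathbf{F}_i = \sum_{j \neq i} \mathbf{F}_{ij} + \mathbf{F}_i',$$

where $\mathbf{F}_{ij} = -\mathbf{F}_{ji} = f_{ij}\mathbf{e}_{ij}$.

Proposition. If the forces of interaction depend only on distance, $f_{ij} = f_{ij}(|\mathbf{r}_i - \mathbf{r}_j|)$, then they are conservative.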


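Proof. If a system consists entirely of two points i and j, then, as is easily seen, the potential energy of the interaction is given by the formula

$$U_{ij}(r) = -\int_{r_0}^{r} f_{ij}(\rho)\,d\rho.$$

We then have

$$-\frac{\partial U_{ij}(|\mathbf{r}_i - \mathbf{r}_j|)}{\partial \mathbf{r}_i} = f_{ij}\frac{\partial |\mathbf{r}_i - \mathbf{r}_j|}{\partial \mathbf{r}_i} = f_{ij}\mathbf{e}_{ij}.$$

Therefore, the potential energy of the interaction of all the points will be

$$U(\mathbf{r}) = \sum_{i>j} U_{ij}(|\mathbf{r}_i - \mathbf{r}_j|).$$ □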


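If the external forces are also conservative, i.e., $\mathbf{F}_i' = -(\partial U_i'/\partial \mathbf{r}_i)$, then the system is conservative, and its total potential energy is

$$U(\mathbf{r}) = \sum_{i>j} U_{ij} + \sum_i U_i'.$$

For such a system the total mechanical energy

$$E = T + U = \sum_i \frac{m_i\dot{\mathbf{r}}_i^2}{2} + \sum_{i>j} U_{ij} + \sum_i U_i'$$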

is  conserved. 

If  the  system  is  not  conservative,  then  the  total  mechanical  energy  is  not 
generally  conserved. 


Definition. A decrease in the mechanical energy $E(t_0) - E(t_1)$ is called an increase in the non-mechanical energy $E'$:

$$E'(t_1) - E'(t_0) = E(t_0) - E(t_1).$$

Theorem (The law of conservation of energy). The total energy $H = E + E'$ is conserved.

This  theorem  is  an  obvious  corollary  of  the  definition  above.  Its  value  lies 
in  the  fact  that  in  concrete  physical  systems,  expressions  for  the  size  of  the 
non-mechanical  energy  can  be  found  in  terms  of  other  physical  quantities 
(temperature,  etc.). 

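E Example: The two-body problem

Suppose that two points with masses $m_1$ and $m_2$ interact with potential U, so that the equations of motion have the form

$$m_1\ddot{\mathbf{r}}_1 = -\frac{\partial U}{\partial \mathbf{r}_1}, \qquad m_2\ddot{\mathbf{r}}_2 = -\frac{\partial U}{\partial \mathbf{r}_2}, \qquad U = U(|\mathbf{r}_1 - \mathbf{r}_2|).$$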


Theorem. The time variation of $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ in the two-body problem is the same as that for the motion of a point of mass $m = m_1m_2/(m_1 + m_2)$ in a field with potential $U(|\mathbf{r}|)$.

We denote by $\mathbf{r}_0$ the radius vector of the center of mass: $\mathbf{r}_0 = (m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/(m_1 + m_2)$. By the theorem on the conservation of momentum, the point $\mathbf{r}_0$ moves uniformly and linearly.

We now look at the vector $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. Multiplying the first of the equations of motion by $m_2$, the second by $m_1$, and subtracting, we find that $m_1m_2\ddot{\mathbf{r}} = -(m_1 + m_2)(\partial U/\partial\mathbf{r})$, where $U = U(|\mathbf{r}_1 - \mathbf{r}_2|) = U(|\mathbf{r}|)$.

In  particular,  in  the  case  of  a  Newtonian  attraction,  the  points  describe 
conic  sections  with  foci  at  their  common  center  of  mass  (Figure  40). 


Figure  40  The  two  body  problem 


Problem.  Determine  the  major  semi-axis  of  the  ellipse  which  the  center  of 
the  earth  describes  around  the  common  center  of  mass  of  the  earth  and  the 
moon.  Where  is  this  center  of  mass,  inside  the  earth  or  outside?  (The  mass 
of  the  moon  is  1/81  times  the  mass  of  the  earth.) 
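One possible way to estimate the answer is sketched below in Python. Since $\mathbf{r}_1 - \mathbf{r}_0 = m_2\mathbf{r}/(m_1 + m_2)$, the earth's orbit about the center of mass is the relative orbit scaled by $m_2/(m_1 + m_2)$. The mean earth-moon distance and the earth's radius used here are rounded values assumed for the illustration; they are not given in the text, and the names are hypothetical.

```python
# Assumed reference values (not from the text), in kilometres.
EARTH_MOON_DISTANCE_KM = 384_400.0   # semi-major axis of the relative orbit
EARTH_RADIUS_KM = 6_371.0            # mean radius of the earth
MASS_RATIO = 1.0 / 81.0              # moon mass / earth mass, as stated in the problem

def earth_semi_axis(relative_semi_axis, mass_ratio):
    """Major semi-axis of the earth's ellipse about the common center of mass.

    With mass_ratio = m2/m1, the scale factor m2/(m1 + m2) equals
    mass_ratio / (1 + mass_ratio).
    """
    return relative_semi_axis * mass_ratio / (1.0 + mass_ratio)

a_earth = earth_semi_axis(EARTH_MOON_DISTANCE_KM, MASS_RATIO)
inside_earth = a_earth < EARTH_RADIUS_KM
print(round(a_earth), inside_earth)
```

Under these assumed values the semi-axis is about 4700 km, so the common center of mass lies inside the earth.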

11 The method of similarity

In  some  cases  it  is  possible  to  obtain  important  information  from  the  form  of  the  equations  of 
motion  without  solving  them,  by  using  the  methods  of  similarity  and  dimension.  The  main  idea 
in  these  methods  is  to  choose  a  change  of  scale  (of  time,  length,  mass,  etc.)  under  which  the 
equations  of  motion  preserve  their  form. 

A  Example 

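Let $\mathbf{r}(t)$ satisfy the equation $m(d^2\mathbf{r}/dt^2) = -(\partial U/\partial\mathbf{r})$. We set $t_1 = \alpha t$ and $m_1 = \alpha^2 m$. Then $\mathbf{r}(t_1)$ satisfies the equation $m_1(d^2\mathbf{r}/dt_1^2) = -(\partial U/\partial\mathbf{r})$. In other words: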

If  the  mass  of  a  point  is  decreased  by  a  factor  of  4,  then  the  point  can  travel 
the  same  orbit  in  the  same  force  field  twice  as  fast.22 

22 Here we are assuming that U does not depend on m. In the field of gravity, the potential
energy  U  is  proportional  to  m,  and  therefore  the  acceleration  does  not  depend  on  the  mass  m 
of  the  moving  point. 


B  A  problem 

Suppose that the potential energy of a central field is a homogeneous function of degree ν:

$$U(\alpha\mathbf{r}) = \alpha^{\nu}U(\mathbf{r}) \quad \text{for any } \alpha > 0.$$

Show that if a curve γ is the orbit of a motion, then the homothetic curve αγ is also an orbit (under the appropriate initial conditions). Determine the ratio of the circulation times along these orbits. Deduce from this the isochronicity of the oscillation of a pendulum (ν = 2) and Kepler’s third law (ν = -1).
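A sketch of one way to obtain the ratio, offered as a hint rather than the author's solution: under the substitution $\mathbf{r} \to \alpha\mathbf{r}$, $t \to \beta t$ the acceleration is multiplied by $\alpha/\beta^2$ and the force $-\partial U/\partial\mathbf{r}$ by $\alpha^{\nu-1}$, so the equation of motion keeps its form when $\beta = \alpha^{1-\nu/2}$. The small Python function below (its name is hypothetical) evaluates this ratio for the two cases mentioned in the problem.

```python
def period_ratio(alpha, nu):
    """Ratio of circulation times on the orbits alpha*gamma and gamma
    for a potential homogeneous of degree nu: beta = alpha**(1 - nu/2)."""
    return alpha ** (1.0 - nu / 2.0)

# nu = 2 (small oscillations): the period does not depend on the amplitude.
print(period_ratio(3.0, 2))    # 1.0

# nu = -1 (Newtonian attraction): T ~ a**(3/2), Kepler's third law.
print(period_ratio(4.0, -1))   # 8.0
```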

Problem. If the radius of a planet is α times the radius of the earth and its mass β times that of the earth, find the ratio of the acceleration of the force of gravity and the first and second cosmic velocities to the corresponding quantities for the earth.

Answer. $\gamma = \beta\alpha^{-2}$, $\delta = \sqrt{\beta/\alpha}$.

For the moon, for example, α = 1/3.7 and β = 1/81. Therefore, the acceleration of gravity is about 1/6 that of the earth (γ ≈ 1/6), and the cosmic velocities are about 1/5 those for the earth (δ ≈ 1/4.7).
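The arithmetic for the moon can be reproduced with a few lines of Python. This is only a check of the numbers quoted above; the function names are hypothetical.

```python
import math

def gravity_ratio(alpha, beta):
    """gamma = beta * alpha**-2: surface gravity relative to the earth."""
    return beta / alpha**2

def velocity_ratio(alpha, beta):
    """delta = sqrt(beta / alpha): cosmic velocities relative to the earth."""
    return math.sqrt(beta / alpha)

alpha, beta = 1.0 / 3.7, 1.0 / 81.0   # values for the moon given in the text
gamma = gravity_ratio(alpha, beta)
delta = velocity_ratio(alpha, beta)
print(1.0 / gamma, 1.0 / delta)       # roughly 5.9 and 4.7
```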

Problem.23  A  desert  animal  has  to  cover  great  distances  between  sources  of 
water.  How  does  the  maximal  time  the  animal  can  run  depend  on  the  size 
L  of  the  animal? 

Answer.  It  is  directly  proportional  to  L. 

Solution.  The  store  of  water  is  proportional  to  the  volume  of  the  body, 
i.e.,  L3 ;  the  evaporation  is  proportional  to  the  surface  area,  i.e.,  L2.  Therefore, 
the  maximal  time  of  a  run  from  one  source  to  another  is  directly  proportional 
to  L. 

We  notice  that  the  maximal  distance  an  animal  can  run  also  grows 
proportionally  to  L  (cf.  the  following  problem). 

Problem.24  How  does  the  running  velocity  of  an  animal  on  level  ground 
and  uphill  depend  on  the  size  L  of  the  animal? 

Answer. On level ground $\sim L^0$, uphill $\sim L^{-1}$.

23  J.  M.  Smith,  Mathematical  Ideas  in  Biology.  Cambridge  University  Press,  1968. 

24  Ibid. 


Solution. The power developed by the animal is proportional to $L^2$ (the percentage used by muscle is constant at about 25%, the other 75% of the chemical energy is converted to heat; the heat output is proportional to the body surface, i.e., $L^2$, which means that the effective power is proportional to $L^2$).

The force of air resistance is directly proportional to the square of the velocity and the area of a cross-section; the power spent on overcoming it is therefore proportional to $v^2L^2 \cdot v$. Therefore, $v^3L^2 \sim L^2$, so $v \sim L^0$. In fact, the running velocity on level ground, no smaller for a rabbit than for a horse, in practice does not specifically depend on the size.

The power necessary to run uphill is $mgv \sim L^3v$; since the generated power is $\sim L^2$, we find that $v \sim L^{-1}$. In fact, a dog easily runs up a hill, while a horse slows its pace.

Problem.24a How does the height of an animal’s jump depend on its size?

Answer. $\sim L^0$.

Solution. For a jump of height h one needs energy proportional to $L^3h$, and the work accomplished by muscular strength F is proportional to FL. The force F is proportional to $L^2$ (since the strength of bones is proportional to their section). Therefore, $L^3h \sim L^2L$, i.e., the height of a jump does not depend on the size of the animal. In fact, a jerboa and a kangaroo can jump to approximately the same height.


24a Ibid.
PART II

LAGRANGIAN  MECHANICS 


Lagrangian  mechanics  describes  motion  in  a  mechanical  system  by  means  of 
the  configuration  space.  The  configuration  space  of  a  mechanical  system  has 
the  structure  of  a  differentiable  manifold,  on  which  its  group  of  diffeo- 
morphisms  acts.  The  basic  ideas  and  theorems  of  lagrangian  mechanics  are 
invariant  under  this  group,25  even  if  formulated  in  terms  of  local  coordinates. 

A  lagrangian  mechanical  system  is  given  by  a  manifold  (“configuration 
space”)  and  a  function  on  its  tangent  bundle  (“the  lagrangian  function”). 

Every  one-parameter  group  of  diffeomorphisms  of  configuration  space 
which  fixes  the  lagrangian  function  defines  a  conservation  law  (i.e.,  a  first 
integral  of  the  equations  of  motion). 

A  newtonian  potential  system  is  a  particular  case  of  a  lagrangian  system 
(the  configuration  space  in  this  case  is  euclidean,  and  the  lagrangian  function 
is  the  difference  between  the  kinetic  and  potential  energies). 

The  lagrangian  point  of  view  allows  us  to  solve  completely  a  series  of 
important  mechanical  problems,  including  problems  in  the  theory  of  small 
oscillations  and  in  the  dynamics  of  a  rigid  body. 


25  And  even  under  larger  groups  of  transformations,  which  also  affect  time. 


Variational  principles 


In  this  chapter  we  show  that  the  motions  of  a  newtonian  potential  system 
are  extremals  of  a  variational  principle,  “Hamilton’s  principle  of  least 
action.” 

This  fact  has  many  important  consequences,  including  a  quick  method 
for  writing  equations  of  motion  in  curvilinear  coordinate  systems,  and  a 
series  of  qualitative  deductions— for  example,  a  theorem  on  returning  to  a 
neighborhood  of  the  initial  point. 

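In this chapter we will use an n-dimensional coordinate space. A vector in such a space is a set of numbers $\mathbf{x} = (x_1, \ldots, x_n)$. Similarly, $\partial f/\partial\mathbf{x}$ means $(\partial f/\partial x_1, \ldots, \partial f/\partial x_n)$, and $(\mathbf{a}, \mathbf{b}) = a_1b_1 + \cdots + a_nb_n$.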

12  Calculus  of  variations 

For  what  follows,  we  will  need  some  facts  from  the  calculus  of  variations.  A  more  detailed 
exposition  can  be  found  in  “A  Course  in  the  Calculus  of  Variations”  by  M.  A.  Lavrentiev  and 
L.  A.  Lusternik,  M.  L.,  1938,  or  G.  E.  Shilov,  “Elementary  Functional  Analysis,”  MIT  Press, 
1974. 

The  calculus  of  variations  is  concerned  with  the  extremals  of  functions 
whose  domain  is  an  infinite-dimensional  space:  the  space  of  curves.  Such 
functions  are  called  functionals. 

An example of a functional is the length of a curve in the euclidean plane: if $\gamma = \{(t, x): x(t) = x,\ t_0 \le t \le t_1\}$, then $\Phi(\gamma) = \int_{t_0}^{t_1} \sqrt{1 + \dot{x}^2}\,dt$.

In  general,  a  functional  is  any  mapping  from  the  space  of  curves  to  the 
real  numbers. 

We consider an “approximation” γ' to γ, $\gamma' = \{(t, x): x = x(t) + h(t)\}$. We will call it γ' = γ + h. Consider the increment of Φ, $\Phi(\gamma + h) - \Phi(\gamma)$ (Figure 41).

Figure  41  Variation  of  a  curve 


A  Variations 

Definition. A functional Φ is called differentiable26 if $\Phi(\gamma + h) - \Phi(\gamma) = F + R$, where F depends linearly on h (i.e., for a fixed γ, $F(h_1 + h_2) = F(h_1) + F(h_2)$ and $F(ch) = cF(h)$), and $R(h, \gamma) = O(h^2)$ in the sense that, for $|h| < \varepsilon$ and $|dh/dt| < \varepsilon$, we have $|R| < C\varepsilon^2$. The linear part of the increment, F(h), is called the differential.

It can be shown that if Φ is differentiable, its differential is uniquely defined. The differential of a functional is also called its variation, and h is called a variation of the curve.


Example. Let $\gamma = \{(t, x): x = x(t),\ t_0 \le t \le t_1\}$ be a curve in the (t, x)-plane; $\dot{x} = dx/dt$; L = L(a, b, c) a differentiable function of three variables. We define a functional Φ by

$$\Phi(\gamma) = \int_{t_0}^{t_1} L(x(t), \dot{x}(t), t)\,dt.$$

In case $L = \sqrt{1 + b^2}$, we get the length of γ.


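Theorem. The functional $\Phi(\gamma) = \int_{t_0}^{t_1} L(x, \dot{x}, t)\,dt$ is differentiable, and its derivative is given by the formula

$$F(h) = \int_{t_0}^{t_1} \Big[\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}}\Big] h\,dt + \Big(\frac{\partial L}{\partial\dot{x}}h\Big)\Big|_{t_0}^{t_1}.$$

Proof.

$$\Phi(\gamma + h) - \Phi(\gamma) = \int_{t_0}^{t_1} [L(x + h, \dot{x} + \dot{h}, t) - L(x, \dot{x}, t)]\,dt = \int_{t_0}^{t_1} \Big[\frac{\partial L}{\partial x}h + \frac{\partial L}{\partial\dot{x}}\dot{h}\Big]dt + O(h^2) = F(h) + R,$$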


26 We should specify the class of curves on which Φ is defined and the linear space which contains h. One could assume, for example, that both spaces consist of the infinitely differentiable functions.


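where

$$F(h) = \int_{t_0}^{t_1} \Big(\frac{\partial L}{\partial x}h + \frac{\partial L}{\partial\dot{x}}\dot{h}\Big)dt \quad \text{and} \quad R = O(h^2).$$

Integrating by parts, we find that

$$\int_{t_0}^{t_1} \frac{\partial L}{\partial\dot{x}}\dot{h}\,dt = -\int_{t_0}^{t_1} h\,\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{x}}\Big)dt + \Big(h\frac{\partial L}{\partial\dot{x}}\Big)\Big|_{t_0}^{t_1}.$$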


B  Extremals 


Definition. An extremal of a differentiable functional $\Phi(\gamma)$ is a curve γ such that F(h) = 0 for all h.

(In exactly the same way that γ is a stationary point of a function if the differential is equal to zero at that point.)

Theorem. The curve $\gamma: x = x(t)$ is an extremal of the functional $\Phi(\gamma) = \int_{t_0}^{t_1} L(x, \dot{x}, t)\,dt$ on the space of curves passing through the points $x(t_0) = x_0$ and $x(t_1) = x_1$, if and only if

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{x}}\Big) - \frac{\partial L}{\partial x} = 0 \quad \text{along the curve } x(t).$$

Lemma. If a continuous function f(t), $t_0 \le t \le t_1$, satisfies $\int_{t_0}^{t_1} f(t)h(t)\,dt = 0$ for any continuous²⁷ function h(t) with $h(t_0) = h(t_1) = 0$, then $f(t) \equiv 0$.


Figure  42  Construction  of  the  function  h 

Proof of the lemma. Let $f(t^*) > 0$ for some $t^*$, $t_0 < t^* < t_1$. Since f is continuous, $f(t) > c$ in some neighborhood Δ of the point $t^*$: $t_0 < t^* - d < t < t^* + d < t_1$. Let h(t) be such that h(t) = 0 outside Δ, h(t) > 0 in Δ, and h(t) = 1 in Δ/2 (i.e., for t such that $t^* - \tfrac{1}{2}d < t < t^* + \tfrac{1}{2}d$). Then, clearly, $\int_{t_0}^{t_1} f(t)h(t)\,dt \ge dc > 0$ (Figure 42). This contradiction shows that $f(t^*) = 0$ for all $t^*$, $t_0 < t^* < t_1$. □

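Proof of the theorem. By the preceding theorem,

$$F(h) = -\int_{t_0}^{t_1} \Big[\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{x}}\Big) - \frac{\partial L}{\partial x}\Big] h\,dt + \Big(\frac{\partial L}{\partial\dot{x}}h\Big)\Big|_{t_0}^{t_1}.$$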


27 Or even for any infinitely differentiable function h.
The term after the integral is equal to zero since $h(t_0) = h(t_1) = 0$. If γ is an extremal, then F(h) = 0 for all h with $h(t_0) = h(t_1) = 0$. Therefore,

$$\int_{t_0}^{t_1} f(t)h(t)\,dt = 0,$$

where

$$f(t) = \frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{x}}\Big) - \frac{\partial L}{\partial x},$$

for all such h. By the lemma, $f(t) \equiv 0$. Conversely, if $f(t) \equiv 0$, then clearly $F(h) \equiv 0$. □


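Example. We verify that the extremals of length are straight lines. We have:

$$L = \sqrt{1 + \dot{x}^2}, \qquad \frac{\partial L}{\partial x} = 0, \qquad \frac{\partial L}{\partial\dot{x}} = \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}},$$

$$\frac{d}{dt}\Big(\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\Big) = 0, \qquad \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} = c, \qquad \dot{x} = c_1, \qquad x = c_1t + c_2.$$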
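The conclusion of this example can be illustrated numerically. The minimal Python sketch below approximates the length functional by the length of an inscribed polygon and compares a straight line with curves x + h, where h vanishes at both endpoints. The particular line, the sine-shaped variation, and the function names are assumptions made for the illustration.

```python
import math

def curve_length(x, t0, t1, steps=2000):
    """Approximate the integral of sqrt(1 + x'(t)^2) over [t0, t1]
    by the length of a polygon inscribed in the graph of x."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        ta = t0 + k * dt
        tb = ta + dt
        total += math.hypot(dt, x(tb) - x(ta))
    return total

def line(t):
    """The extremal x = c1*t + c2 with the assumed constants c1 = 2, c2 = 1."""
    return 2.0 * t + 1.0

def perturbed(eps):
    """Curve x + h with h(t) = eps*sin(pi*t), so that h(0) = h(1) = 0."""
    return lambda t: line(t) + eps * math.sin(math.pi * t)

straight = curve_length(line, 0.0, 1.0)
others = [curve_length(perturbed(eps), 0.0, 1.0) for eps in (-0.2, -0.05, 0.05, 0.2)]
print(straight, others)  # the straight line gives the smallest value
```

For this family of variations the straight line is not only an extremal but the shortest curve, in agreement with elementary geometry.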


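C The Euler-Lagrange equation

Definition. The equation

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{x}}\Big) - \frac{\partial L}{\partial x} = 0$$

is called the Euler-Lagrange equation for the functional

$$\Phi = \int_{t_0}^{t_1} L(x, \dot{x}, t)\,dt.$$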


Now let $\mathbf{x}$ be a vector in the n-dimensional coordinate space $\mathbb{R}^n$, $\gamma = \{(t, \mathbf{x}): \mathbf{x} = \mathbf{x}(t),\ t_0 \le t \le t_1\}$ a curve in the (n + 1)-dimensional space $\mathbb{R} \times \mathbb{R}^n$, and $L: \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ a function of 2n + 1 variables. As before, we show:

Theorem. The curve γ is an extremal of the functional $\Phi(\gamma) = \int_{t_0}^{t_1} L(\mathbf{x}, \dot{\mathbf{x}}, t)\,dt$ on the space of curves joining $(t_0, \mathbf{x}_0)$ and $(t_1, \mathbf{x}_1)$, if and only if the Euler-Lagrange equation is satisfied along γ.

This is a system of n second-order equations, and the solution depends on 2n arbitrary constants. The 2n conditions $\mathbf{x}(t_0) = \mathbf{x}_0$, $\mathbf{x}(t_1) = \mathbf{x}_1$ are used for finding them.


Problem.  Cite  examples  where  there  are  many  extremals  connecting  two 
given  points,  and  others  where  there  are  none  at  all. 
D An important remark

The condition for a curve γ to be an extremal of a functional does not depend on the choice of coordinate system.

For example, the same functional (the length of a curve) is given in cartesian and polar coordinates by the different formulas

$$\Phi_{\text{cart}} = \int_{t_0}^{t_1} \sqrt{\dot{x}_1^2 + \dot{x}_2^2}\,dt, \qquad \Phi_{\text{pol}} = \int_{t_0}^{t_1} \sqrt{\dot{r}^2 + r^2\dot{\varphi}^2}\,dt.$$

The extremals are the same: straight lines in the plane. The equations of lines in cartesian and polar coordinates are given by different functions: $x_1 = x_1(t)$, $x_2 = x_2(t)$, and $r = r(t)$, $\varphi = \varphi(t)$.

However, both these vector functions satisfy the Euler-Lagrange equation

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\mathbf{x}}} - \frac{\partial L}{\partial\mathbf{x}} = 0,$$

only, in the first case, when $\mathbf{x}_{\text{cart}} = (x_1, x_2)$ and $L_{\text{cart}} = \sqrt{\dot{x}_1^2 + \dot{x}_2^2}$, and in the second case when $\mathbf{x}_{\text{pol}} = (r, \varphi)$ and $L_{\text{pol}} = \sqrt{\dot{r}^2 + r^2\dot{\varphi}^2}$.

In this way we can easily describe in any coordinates a differential equation for the family of all straight lines.


Problem.  Find  the  differential  equation  for  the  family  of  all  straight  lines 
in  the  plane  in  polar  coordinates. 


13 Lagrange’s equations

Here  we  indicate  the  variational  principle  whose  extremals  are  solutions  of  Newton’s  equations 
of  motion  in  a  potential  system. 

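We compare Newton’s equations of dynamics

$$\frac{d}{dt}(m_i\dot{\mathbf{r}}_i) + \frac{\partial U}{\partial\mathbf{r}_i} = 0 \qquad (1)$$

with the Euler-Lagrange equation

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\mathbf{x}}} - \frac{\partial L}{\partial\mathbf{x}} = 0.$$

A Hamilton's principle of least action

Theorem. Motions of the mechanical system (1) coincide with extremals of the functional

$$\Phi(\gamma) = \int_{t_0}^{t_1} L\,dt, \quad \text{where } L = T - U$$

is the difference between the kinetic and potential energy.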
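Proof. Since $U = U(\mathbf{r})$ and $T = \sum m_i\dot{\mathbf{r}}_i^2/2$, we have $\partial L/\partial\dot{\mathbf{r}}_i = \partial T/\partial\dot{\mathbf{r}}_i = m_i\dot{\mathbf{r}}_i$ and $\partial L/\partial\mathbf{r}_i = -\partial U/\partial\mathbf{r}_i$. □

Corollary. Let $(q_1, \ldots, q_{3n})$ be any coordinates in the configuration space of a system of n mass points. Then the evolution of $\mathbf{q}$ with time is subject to the Euler-Lagrange equations

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot{\mathbf{q}}}\Big) - \frac{\partial L}{\partial\mathbf{q}} = 0, \quad \text{where } L = T - U.$$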

Proof. By the theorem above, a motion is an extremal of the functional $\int L\,dt$. Therefore, in any system of coordinates the Euler-Lagrange equation written in that coordinate system is satisfied. □

Definition. In mechanics we use the following terminology: $L(\mathbf{q}, \dot{\mathbf{q}}, t) = T - U$ is the Lagrange function or lagrangian, $q_i$ are the generalized coordinates, $\dot{q}_i$ are generalized velocities, $\partial L/\partial\dot{q}_i = p_i$ are generalized momenta, $\partial L/\partial q_i$ are generalized forces, $\int_{t_0}^{t_1} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt$ is the action, $(d(\partial L/\partial\dot{q}_i)/dt) - (\partial L/\partial q_i) = 0$ are Lagrange’s equations.

The last theorem is called “Hamilton’s form of the principle of least action” because in many cases the motion $\mathbf{q}(t)$ is not only an extremal but is also a minimum value of the action functional $\int_{t_0}^{t_1} L\,dt$.

B  The  simplest  examples 
Example 1. For a free mass point in $E^3$,

$$L = T = \frac{m\dot{\mathbf{r}}^2}{2};$$

in cartesian coordinates $q_i = r_i$ we find

$$L = \frac{m}{2}(\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2).$$

Here the generalized velocities are the components of the velocity vector, the generalized momenta $p_i = m\dot{q}_i$ are the components of the momentum vector, and Lagrange’s equations coincide with Newton’s equations $d\mathbf{p}/dt = 0$. The extremals are straight lines. It follows from Hamilton’s principle that straight lines are not only shortest (i.e., extremals of the length $\int_{t_0}^{t_1} \sqrt{\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2}\,dt$) but also extremals of the action $\int_{t_0}^{t_1} (\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2)\,dt$.

Problem. Show that this extremum is a minimum.

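Example 2. We consider planar motion in a central field in polar coordinates $q_1 = r$, $q_2 = \varphi$. From the relation $\dot{\mathbf{r}} = \dot{r}\mathbf{e}_r + \dot{\varphi}r\mathbf{e}_\varphi$ we find the kinetic energy $T = \tfrac{1}{2}m\dot{\mathbf{r}}^2 = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\varphi}^2)$ and the lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}) = T(\mathbf{q}, \dot{\mathbf{q}}) - U(\mathbf{q})$, where $U = U(q_1)$.

The generalized momenta will be $\mathbf{p} = \partial L/\partial\dot{\mathbf{q}}$, i.e.,

$$p_1 = m\dot{r}, \qquad p_2 = mr^2\dot{\varphi}.$$

The first Lagrange equation $\dot{p}_1 = \partial L/\partial q_1$ takes the form

$$m\ddot{r} = mr\dot{\varphi}^2 - \frac{\partial U}{\partial r}.$$

We already obtained this equation in Section 8.

Since $q_2 = \varphi$ does not enter into L, we have $\partial L/\partial q_2 = 0$. Therefore, the second Lagrange equation will be $\dot{p}_2 = 0$, $p_2 = \text{const}$. This is the law of conservation of angular momentum.

In general, when the field is not central ($U = U(r, \varphi)$), we find $\dot{p}_2 = -\partial U/\partial\varphi$.

This equation can be rewritten in the form $d(\mathbf{M}, \mathbf{e}_z)/dt = N$, where $N = ([\mathbf{r}, \mathbf{F}], \mathbf{e}_z)$ and $\mathbf{F} = -\partial U/\partial\mathbf{r}$. (The rate of change in angular momentum relative to the z axis is equal to the moment of the force $\mathbf{F}$ relative to the z axis.)

In fact, we have $dU = (\partial U/\partial r)dr + (\partial U/\partial\varphi)d\varphi = -(\mathbf{F}, d\mathbf{r}) = -(\mathbf{F}, \mathbf{e}_r)dr - r(\mathbf{F}, \mathbf{e}_\varphi)d\varphi$; therefore, $-\partial U/\partial\varphi = r(\mathbf{F}, \mathbf{e}_\varphi) = r([\mathbf{e}_r, \mathbf{F}], \mathbf{e}_z) = ([\mathbf{r}, \mathbf{F}], \mathbf{e}_z)$.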

This example suggests the following generalization of the law of conservation of angular momentum.

Definition. A coordinate $q_i$ is called cyclic if it does not enter into the lagrangian: $\partial L/\partial q_i = 0$.

Theorem. The generalized momentum corresponding to a cyclic coordinate is conserved: $p_i = \text{const}$.

Proof. By Lagrange’s equation $dp_i/dt = \partial L/\partial q_i = 0$. □
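The theorem can be observed numerically on Example 2. The Python sketch below integrates Lagrange's equations in polar coordinates with a classical fourth-order Runge-Kutta step and monitors $p_2 = mr^2\dot{\varphi}$, the momentum of the cyclic coordinate φ. The potential U = -k/r, the initial data, the step size, and all names are assumptions chosen for the illustration.

```python
def derivatives(state, m, k):
    """Right-hand sides of Lagrange's equations for
    L = m*(r_dot**2 + r**2 * phi_dot**2)/2 + k/r  (assumed U = -k/r)."""
    r, r_dot, phi, phi_dot = state
    r_ddot = r * phi_dot**2 - k / (m * r**2)   # m r'' = m r phi'^2 - dU/dr
    phi_ddot = -2.0 * r_dot * phi_dot / r      # d/dt (m r^2 phi') = 0
    return (r_dot, r_ddot, phi_dot, phi_ddot)

def rk4_step(state, dt, m, k):
    """One classical Runge-Kutta step for the first-order system above."""
    def shifted(base, slope, h):
        return tuple(x + h * s for x, s in zip(base, slope))
    k1 = derivatives(state, m, k)
    k2 = derivatives(shifted(state, k1, 0.5 * dt), m, k)
    k3 = derivatives(shifted(state, k2, 0.5 * dt), m, k)
    k4 = derivatives(shifted(state, k3, dt), m, k)
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def generalized_momentum_phi(state, m):
    """p_2 = m r^2 phi_dot, the momentum conjugate to the cyclic coordinate."""
    r, _, _, phi_dot = state
    return m * r * r * phi_dot

m, k, dt = 1.0, 1.0, 0.001
state = (1.0, 0.0, 0.0, 1.2)      # r, r_dot, phi, phi_dot at t = 0
p2_start = generalized_momentum_phi(state, m)
r_max = state[0]
for _ in range(20000):            # integrate up to t = 20
    state = rk4_step(state, dt, m, k)
    r_max = max(r_max, state[0])
p2_end = generalized_momentum_phi(state, m)
print(p2_start, p2_end, r_max)    # r varies along the orbit, p_2 does not
```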


14  Legendre  transformations 

The  Legendre  transformation  is  a  very  useful  mathematical  tool:  it  transforms  functions  on  a 
vector  space  to  functions  on  the  dual  space.  Legendre  transformations  are  related  to  projective 
duality  and  tangential  coordinates  in  algebraic  geometry  and  the  construction  of  dual  Banach 
spaces  in  analysis.  They  are  often  encountered  in  physics  (for  example,  in  the  definition  of 
thermodynamic  quantities). 


A  Definition 

Let y = f(x) be a convex function, $f''(x) > 0$.

The Legendre transformation of the function f is a new function g of a new variable p, which is constructed in the following way (Figure 43). We draw the graph of f in the x, y plane. Let p be a given number. Consider the straight line y = px. We take the point x = x(p) at which the curve is farthest from the straight line in the vertical direction: for each p the function $px - f(x) = F(p, x)$ has a maximum with respect to x at the point x(p). Now we define $g(p) = F(p, x(p))$.

Figure 43 Legendre transformation

The point x(p) is defined by the extremal condition $\partial F/\partial x = 0$, i.e., $f'(x) = p$. Since f is convex, the point x(p) is unique.28

Problem. Show that the domain of g can be a point, a closed interval, or a ray if f is defined on the whole x axis. Prove that if f is defined on a closed interval, then g is defined on the whole p axis.

B  Examples 

Example 1. Let $f(x) = x^2$. Then $F(p, x) = px - x^2$, $x(p) = \tfrac{1}{2}p$, $g(p) = \tfrac{1}{4}p^2$.

Example 2. Let $f(x) = mx^2/2$. Then $g(p) = p^2/2m$.

Example 3. Let $f(x) = x^\alpha/\alpha$. Then $g(p) = p^\beta/\beta$, where $(1/\alpha) + (1/\beta) = 1$ ($\alpha > 1$, $\beta > 1$).
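These three examples can be checked directly from the definition $g(p) = \max_x (px - f(x))$. The following Python sketch takes the maximum over a finite grid of sample points, which is only an approximation of the true maximum; the grid, the chosen values of p, m, and α, and the function name are assumptions for the illustration.

```python
def legendre(f, p, xs):
    """Approximate g(p) = max over x of (p*x - f(x)), using sample points xs."""
    return max(p * x - f(x) for x in xs)

xs = [i / 1000.0 for i in range(0, 5001)]   # grid on [0, 5] with step 0.001

# Example 1: f = x^2, so g(p) = p^2/4; at p = 3 this is 2.25.
g1 = legendre(lambda x: x**2, 3.0, xs)

# Example 2 with m = 3: f = m x^2/2, so g(p) = p^2/(2m); at p = 3 this is 1.5.
g2 = legendre(lambda x: 3.0 * x**2 / 2.0, 3.0, xs)

# Example 3 with alpha = 3, beta = 3/2: g(p) = p^beta/beta; at p = 4 this is 16/3.
g3 = legendre(lambda x: x**3 / 3.0, 4.0, xs)

print(g1, g2, g3)
```

The grid is restricted to x ≥ 0 because in Example 3 the function $x^3/3$ is convex only there.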


Figure  44  Legendre  transformation  taking  an  angle  to  a  line  segment 

Example 4. Let f(x) be a convex polygon. Then g(p) is also a convex polygon, in which the vertices of f(x) correspond to the edges of g(p), and the edges of f(x) to the vertices of g(p). For example, the corner depicted in Figure 44 is transformed to a segment under the Legendre transformation.

28  If  it  exists. 
C Involutivity

Let us consider a function f which is differentiable as many times as necessary, with $f''(x) > 0$. It is easy to verify that a Legendre transformation takes convex functions to convex functions. Therefore, we can apply it twice.

Theorem.  The  Legendre  transformation  is  involutive,  i.e.,  its  square  is  the 
identity:  if  under  the  Legendre  transformation  f  is  taken  to  g,  then  the 
Legendre  transform  of  g  will  again  be  f. 

Proof.  In  order  to  apply  the  Legendre  transform  to  g,  with  variable  p,  we 
must  by  definition  look  at  a  new  independent  variable  (which  we  will  call  x), 
construct  the  function 


G(x,  p)  =  xp  -  g(p), 

and  find  the  point  p(x)  at  which  G  attains  its  maximum:  dG/dp  =  0,  i.e., 
g'(p)  =  x.  Then  the  Legendre  transform  of  g(p )  will  be  the  function  of  x 
equal  to  G(x,  p(x)). 

We will show that $G(x, p(x)) = f(x)$. To this end we notice that $G(x, p) = xp - g(p)$ has a simple geometric interpretation: it is the ordinate of the point with abscissa x on the line tangent to the graph of f(x) with slope p (Figure 45). For fixed p, the function G(x, p) is a linear function of x, with $\partial G/\partial x = p$, and for x = x(p) we have $G(x, p) = xp - g(p) = f(x)$ by the definition of g(p).

Figure 45 Involutivity of the Legendre transformation

Let us now fix $x = x_0$ and vary p. Then the values of G(x, p) will be the ordinates of the points of intersection of the line $x = x_0$ with the line tangent to the graph of f(x) with various slopes p. By the convexity of the graph it follows that all these tangents lie below the curve, and therefore the maximum of $G(x_0, p)$ for fixed $x_0$ is equal to $f(x_0)$ (and is achieved for $p = p(x_0) = f'(x_0)$). □
Figure 46 Legendre transformation of a quadratic form

Corollary.29 Consider a given family of straight lines $y = px - g(p)$. Then its envelope has the equation y = f(x), where f is the Legendre transform of g.

D Young's inequality

Definition. Two functions, f and g, which are the Legendre transforms of one another are called dual in the sense of Young.

By definition of the Legendre transform, $F(x, p) = px - f(x)$ is less than or equal to g(p) for any x and p. From this we have Young's inequality:

$$px \le f(x) + g(p).$$

Example 1. If $f(x) = \tfrac{1}{2}x^2$, then $g(p) = \tfrac{1}{2}p^2$ and we obtain the well-known inequality $px \le \tfrac{1}{2}x^2 + \tfrac{1}{2}p^2$ for all x and p.

Example 2. If $f(x) = x^\alpha/\alpha$, then $g(p) = p^\beta/\beta$, where $(1/\alpha) + (1/\beta) = 1$, and we obtain Young's inequality $px \le (x^\alpha/\alpha) + (p^\beta/\beta)$ for all $x > 0$, $p > 0$, $\alpha > 1$, $\beta > 1$, and $(1/\alpha) + (1/\beta) = 1$.
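Young's inequality of Example 2 can be spot-checked on random data. The Python sketch below evaluates the difference between the right-hand and left-hand sides; the sampling ranges, the fixed seed, and the function name are assumptions made for the illustration, and a finite sample is of course not a proof.

```python
import random

def young_gap(x, p, alpha):
    """Return x^alpha/alpha + p^beta/beta - p*x, where 1/alpha + 1/beta = 1.
    Young's inequality states that this quantity is non-negative."""
    beta = alpha / (alpha - 1.0)
    return x**alpha / alpha + p**beta / beta - p * x

random.seed(0)
samples = [(random.uniform(0.01, 5.0),    # x > 0
            random.uniform(0.01, 5.0),    # p > 0
            random.uniform(1.1, 4.0))     # alpha > 1
           for _ in range(1000)]
smallest_gap = min(young_gap(x, p, a) for x, p, a in samples)

# Equality holds at corresponding points p = f'(x) = x^(alpha - 1).
equality_gap = young_gap(2.0, 2.0**2, 3.0)
print(smallest_gap, equality_gap)
```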

E  The  case  of  many  variables 

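Now let $f(\mathbf{x})$ be a convex function of the vector variable $\mathbf{x} = (x_1, \ldots, x_n)$ (i.e., the quadratic form $((\partial^2 f/\partial\mathbf{x}^2)d\mathbf{x}, d\mathbf{x})$ is positive definite). Then the Legendre transform is the function $g(\mathbf{p})$ of the vector variable $\mathbf{p} = (p_1, \ldots, p_n)$, defined as above by the equalities $g(\mathbf{p}) = F(\mathbf{p}, \mathbf{x}(\mathbf{p})) = \max_{\mathbf{x}} F(\mathbf{p}, \mathbf{x})$, where $F(\mathbf{p}, \mathbf{x}) = (\mathbf{p}, \mathbf{x}) - f(\mathbf{x})$ and $\mathbf{p} = \partial f/\partial\mathbf{x}$.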

All  of  the  above  arguments,  including  Young’s  inequality,  can  be  carried 
over  without  change  to  this  case. 

Problem. Let $f: \mathbb{R}^n \to \mathbb{R}$ be a convex function. Let $\mathbb{R}^{n*}$ denote the dual vector space. Show that the formulas above completely define the mapping $g: \mathbb{R}^{n*} \to \mathbb{R}$ (under the condition that the linear form $df|_{\mathbf{x}}$ ranges over all of $\mathbb{R}^{n*}$ when $\mathbf{x}$ ranges over $\mathbb{R}^n$).

29  One  can  easily  see  that  this  is  the  theory  of  “Clairaut’s  equation.” 
Problem. Let f be the quadratic form $f(\mathbf{x}) = \sum f_{ij}x_ix_j$. Show that its Legendre transform is again a quadratic form $g(\mathbf{p}) = \sum g_{ij}p_ip_j$, and that the values of both forms at corresponding points coincide (Figure 46):

$$f(\mathbf{x}(\mathbf{p})) = g(\mathbf{p}) \quad \text{and} \quad g(\mathbf{p}(\mathbf{x})) = f(\mathbf{x}).$$


15 Hamilton’s equations

By means of a Legendre transformation, a lagrangian system of second-order differential equations is converted into a remarkably symmetrical system of 2n first-order equations called a hamiltonian system of equations (or canonical equations).

A  Equivalence  of  Lagrange's  and  Hamilton's 
equations 

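We consider the system of Lagrange’s equations $\dot{\mathbf{p}} = \partial L/\partial \mathbf{q}$, where $\mathbf{p} = \partial L/\partial \dot{\mathbf{q}}$, with a given lagrangian function $L: \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$, which we will assume to be convex30 with respect to the second argument $\dot{\mathbf{q}}$.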


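Theorem. The system of Lagrange's equations is equivalent to the system of $2n$ first-order equations (Hamilton's equations)

$$\dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}}, \qquad \dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}},$$

where $H(\mathbf{p}, \mathbf{q}, t) = \mathbf{p}\dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is the Legendre transform of the lagrangian function viewed as a function of $\dot{\mathbf{q}}$.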

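Proof. By definition, the Legendre transform of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ with respect to $\dot{\mathbf{q}}$ is the function $H(\mathbf{p}) = \mathbf{p}\dot{\mathbf{q}} - L(\dot{\mathbf{q}})$, in which $\dot{\mathbf{q}}$ is expressed in terms of $\mathbf{p}$ by the formula $\mathbf{p} = \partial L/\partial \dot{\mathbf{q}}$, and which depends on the parameters $\mathbf{q}$ and $t$. This function $H$ is called the hamiltonian.

The total differential of the hamiltonian

$$dH = \frac{\partial H}{\partial \mathbf{p}}\,d\mathbf{p} + \frac{\partial H}{\partial \mathbf{q}}\,d\mathbf{q} + \frac{\partial H}{\partial t}\,dt$$

is equal to the total differential of $\mathbf{p}\dot{\mathbf{q}} - L$ for $\mathbf{p} = \partial L/\partial \dot{\mathbf{q}}$:

$$dH = \dot{\mathbf{q}}\,d\mathbf{p} - \frac{\partial L}{\partial \mathbf{q}}\,d\mathbf{q} - \frac{\partial L}{\partial t}\,dt.$$

Both expressions for $dH$ must be the same. Therefore,

$$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}}, \qquad \frac{\partial H}{\partial \mathbf{q}} = -\frac{\partial L}{\partial \mathbf{q}}, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}.$$

Applying Lagrange’s equations $\dot{\mathbf{p}} = \partial L/\partial \mathbf{q}$, we obtain Hamilton’s equations.

We have seen that, if $\mathbf{q}(t)$ satisfies Lagrange’s equations, then $(\mathbf{p}(t), \mathbf{q}(t))$ satisfies Hamilton’s equations. The converse is proved in an analogous manner. Therefore, the systems of Lagrange and Hamilton are equivalent.

□

30 In practice this convex function will often be a positive definite quadratic form.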

Remark.  The  theorem  just  proved  applies  to  all  variational  problems,  not 
just  to  the  lagrangian  equations  of  mechanics. 

B  Hamilton’s  function  and  energy 

Example. Suppose now that the equations are mechanical, so that the lagrangian has the usual form $L = T - U$, where the kinetic energy $T$ is a quadratic form with respect to $\dot{\mathbf{q}}$:

$$T = \tfrac{1}{2} \sum a_{ij}\,\dot{q}_i \dot{q}_j, \quad \text{where } a_{ij} = a_{ij}(\mathbf{q}, t) \text{ and } U = U(\mathbf{q}).$$

Theorem.  Under  the  given  assumptions,  the  hamiltonian  H  is  the  total  energy 
H  =  T  +  U. 

The  proof  is  based  on  the  following  lemma  on  the  Legendre  transform  of 
a  quadratic  form. 

Lemma. The values of a quadratic form $f(\mathbf{x})$ and of its Legendre transform $g(\mathbf{p})$ coincide at corresponding points: $f(\mathbf{x}) = g(\mathbf{p})$.

Example. For the form $f(x) = x^2$ this is a well-known property of a tangent to a parabola. For the form $f(x) = \tfrac{1}{2}mx^2$ we have $p = mx$ and $g(p) = p^2/2m = mx^2/2 = f(x)$.

Proof of the lemma. By Euler’s theorem on homogeneous functions $(\partial f/\partial \mathbf{x})\mathbf{x} = 2f$. Therefore, $g(\mathbf{p}(\mathbf{x})) = \mathbf{p}\mathbf{x} - f(\mathbf{x}) = (\partial f/\partial \mathbf{x})\mathbf{x} - f = 2f(\mathbf{x}) - f(\mathbf{x}) = f(\mathbf{x})$. □
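The lemma can be illustrated on a concrete quadratic form in two variables. This is a sketch under stated assumptions: the symmetric matrix `F`, the sample point `x`, and the helper names are hypothetical, and the closed form $g(\mathbf{p}) = \tfrac{1}{4}\mathbf{p}^T F^{-1}\mathbf{p}$ used below is the standard Legendre transform of $f(\mathbf{x}) = \mathbf{x}^T F \mathbf{x}$ for positive definite $F$.

```python
def quad_form(F, x):
    """f(x) = sum_ij F[i][j] * x[i] * x[j] for a 2x2 matrix F."""
    return sum(F[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def momentum(F, x):
    """p = df/dx = 2 F x (valid because F is symmetric)."""
    return [2 * sum(F[i][j] * x[j] for j in range(2)) for i in range(2)]


def legendre_quad(F, p):
    """g(p) = (1/4) p^T F^{-1} p, the Legendre transform of x^T F x."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    inv = [[F[1][1] / det, -F[0][1] / det],
           [-F[1][0] / det, F[0][0] / det]]
    return 0.25 * quad_form(inv, p)


F = [[2.0, 1.0], [1.0, 3.0]]   # assumed positive definite example
x = [1.0, -2.0]
p = momentum(F, x)             # the point corresponding to x
f_value = quad_form(F, x)
g_value = legendre_quad(F, p)
pairing = p[0] * x[0] + p[1] * x[1]   # (p, x), equal to 2 f by Euler's theorem
```

For this choice the two values `f_value` and `g_value` agree, and `pairing` equals twice `f_value`, which is exactly the step used in the proof.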

Proof of the theorem. Reasoning as in the lemma, we find that $H = \mathbf{p}\dot{\mathbf{q}} - L = 2T - (T - U) = T + U$. □

Example. For one-dimensional motion

$$\ddot{q} = -\frac{\partial U}{\partial q}.$$

In this case $T = \tfrac{1}{2}\dot{q}^2$, $U = U(q)$, $p = \dot{q}$, $H = \tfrac{1}{2}p^2 + U(q)$ and Hamilton’s equations take the form

$$\dot{q} = p, \qquad \dot{p} = -\frac{\partial U}{\partial q}.$$


This  example  makes  it  easy  to  remember  which  of  Hamilton’s  equations 
has  a  minus  sign. 

Several  important  corollaries  follow  from  the  theorem  on  the  equivalence 
of  the  equations  of  motion  to  a  hamiltonian  system.  For  example,  the  law  of 
conservation  of  energy  takes  the  simple  form : 

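Corollary 1. $dH/dt = \partial H/\partial t$. In particular, for a system whose hamiltonian function does not depend explicitly on time ($\partial H/\partial t = 0$), the law of conservation of the hamiltonian function holds: $H(\mathbf{p}(t), \mathbf{q}(t)) = \text{const}$.

Proof. We consider the variation in $H$ along the trajectory $H(\mathbf{p}(t), \mathbf{q}(t), t)$. Then, by Hamilton’s equations,

$$\frac{dH}{dt} = \frac{\partial H}{\partial \mathbf{p}}\left(-\frac{\partial H}{\partial \mathbf{q}}\right) + \frac{\partial H}{\partial \mathbf{q}}\frac{\partial H}{\partial \mathbf{p}} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t}.$$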
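Corollary 1 can be observed numerically by integrating Hamilton's equations for a time-independent hamiltonian and monitoring $H$ along the trajectory. The sketch below assumes a pendulum, $H = \tfrac{1}{2}p^2 - \cos q$, and a classical fourth-order Runge-Kutta step; both are illustrative choices, and the integrator conserves $H$ only up to its discretization error.

```python
import math


def hamiltonian(p, q):
    """Assumed example: H = p^2/2 + U(q) with U(q) = -cos q."""
    return 0.5 * p * p - math.cos(q)


def rhs(p, q):
    """Hamilton's equations: dp/dt = -dH/dq = -sin q, dq/dt = dH/dp = p."""
    return -math.sin(q), p


def rk4_step(p, q, dt):
    k1p, k1q = rhs(p, q)
    k2p, k2q = rhs(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
    k3p, k3q = rhs(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
    k4p, k4q = rhs(p + dt * k3p, q + dt * k3q)
    p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    return p, q


def energy_drift(p0, q0, dt=0.01, steps=1000):
    """Largest |H(t) - H(0)| observed along the numerical trajectory."""
    h0 = hamiltonian(p0, q0)
    p, q, worst = p0, q0, 0.0
    for _ in range(steps):
        p, q = rk4_step(p, q, dt)
        worst = max(worst, abs(hamiltonian(p, q) - h0))
    return worst
```

Calling `energy_drift(0.0, 1.0)` returns a very small number, consistent with $H(p(t), q(t)) = \text{const}$; the residual is numerical error, not a property of the exact flow.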

C  Cyclic  coordinates 

When  considering  central  fields,  we  noticed  that  a  problem  could  be  reduced 
to  a  one-dimensional  problem  by  the  introduction  of  polar  coordinates.  It 
turns  out  that,  given  any  symmetry  of  a  problem  allowing  us  to  choose  a 
system  of  coordinates  q  in  such  a  way  that  the  hamiltonian  function  is 
independent  of  some  of  the  coordinates,  we  can  find  some  first  integrals  and 
thereby  reduce  to  a  problem  in  a  smaller  number  of  coordinates. 

Definition. If a coordinate $q_1$ does not enter into the hamiltonian function $H(p_1, p_2, \ldots, p_n; q_1, \ldots, q_n; t)$, i.e., $\partial H/\partial q_1 = 0$, then it is called cyclic (the term comes from the particular case of the angular coordinate in a central field).

Clearly, the coordinate $q_1$ is cyclic if and only if it does not enter into the lagrangian function ($\partial L/\partial q_1 = 0$). It follows from the hamiltonian form of the equations of motion that:


Corollary 2. Let $q_1$ be a cyclic coordinate. Then $p_1$ is a first integral. In this case the variation of the remaining coordinates with time is the same as in a system with the $n - 1$ independent coordinates $q_2, \ldots, q_n$ and with hamiltonian function

$$H(p_2, \ldots, p_n, q_2, \ldots, q_n, t, c),$$

depending on the parameter $c = p_1$.

Proof. We set $\mathbf{p}' = (p_2, \ldots, p_n)$ and $\mathbf{q}' = (q_2, \ldots, q_n)$. Then Hamilton’s equations take the form

$$\frac{d}{dt}\mathbf{q}' = \frac{\partial H}{\partial \mathbf{p}'}, \qquad \frac{d}{dt}q_1 = \frac{\partial H}{\partial p_1},$$

$$\frac{d}{dt}\mathbf{p}' = -\frac{\partial H}{\partial \mathbf{q}'}, \qquad \frac{d}{dt}p_1 = 0.$$

The last equation shows that $p_1 = \text{const}$. Therefore, in the system of equations for $\mathbf{p}'$ and $\mathbf{q}'$, the value of $p_1$ enters only as a parameter in the hamiltonian function. After this system of $2n - 2$ equations is solved, the equation for $q_1$ takes the form

$$\frac{d}{dt}q_1 = f(t), \quad \text{where } f(t) = \frac{\partial}{\partial p_1} H(p_1, \mathbf{p}'(t), \mathbf{q}'(t), t),$$

and is easily integrated. □

Almost  all  the  solved  problems  in  mechanics  have  been  solved  by  means 
of  Corollary  2. 

Corollary 3. Every closed system with two degrees of freedom ($n = 2$) which has a cyclic coordinate is integrable.

Proof. In this case the system for $\mathbf{p}'$ and $\mathbf{q}'$ is one-dimensional and is immediately integrated by means of the integral $H(\mathbf{p}', \mathbf{q}') = c$. □
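The reduction in Corollary 2 can be sketched for motion in a central field written in polar coordinates, where the angle $\varphi$ is cyclic. The example assumes unit mass and the potential $U(r) = r^2/2$, so that $H = \tfrac{1}{2}p_r^2 + c^2/(2r^2) + \tfrac{1}{2}r^2$ with $c = p_\varphi$; the potential, the initial data, and the function names are illustrative, and a Runge-Kutta step stands in for an exact solution.

```python
import math


def reduced_rhs(state, c):
    """Right-hand side for (r, p_r, phi) with H = p_r^2/2 + c^2/(2 r^2) + r^2/2."""
    r, pr, _phi = state
    return (pr,                   # dr/dt   =  dH/dp_r
            c * c / r ** 3 - r,   # dp_r/dt = -dH/dr
            c / (r * r))          # dphi/dt =  dH/dp_phi, a quadrature once r(t) is known


def rk4(state, c, dt):
    def shift(s, k, h):
        return tuple(a + h * b for a, b in zip(s, k))

    k1 = reduced_rhs(state, c)
    k2 = reduced_rhs(shift(state, k1, dt / 2), c)
    k3 = reduced_rhs(shift(state, k2, dt / 2), c)
    k4 = reduced_rhs(shift(state, k3, dt), c)
    return tuple(s + dt * (a + 2 * b + 2 * d + e) / 6
                 for s, a, b, d, e in zip(state, k1, k2, k3, k4))


def orbit(c, t_end, steps=2000):
    """Integrate from r = 1, p_r = 0, phi = 0 with angular momentum p_phi = c."""
    state = (1.0, 0.0, 0.0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4(state, c, dt)
    return state
```

For this potential the motion is also known in Cartesian form, $x = \cos t$, $y = c \sin t$, so `orbit(0.5, math.pi / 2)` should return $r \approx 0.5$ and $\varphi \approx \pi/2$. Only the pair $(r, p_r)$ is coupled; $\varphi$ is obtained by integrating a known function of time, as in the proof.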

16  Liouville’s  theorem 

The  phase  flow  of  Hamilton’s  equations  preserves  phase  volume.  It  follows,  for  example,  that  a 
hamiltonian  system  cannot  be  asymptotically  stable. 

For simplicity we look at the case in which the hamiltonian function does not depend explicitly on the time: $H = H(\mathbf{p}, \mathbf{q})$.

A  The  phase  flow 

Definition. The $2n$-dimensional space with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n$ is called phase space.

Example. In the case $n = 1$ this is the phase plane of the system $\ddot{x} = -\partial U/\partial x$, which we considered in Section 4.

Just as in this simplest example, the right-hand sides of Hamilton’s equations give a vector field: at each point $(\mathbf{p}, \mathbf{q})$ of phase space there is a $2n$-dimensional vector $(-\partial H/\partial \mathbf{q}, \partial H/\partial \mathbf{p})$. We assume that every solution of Hamilton’s equations can be extended to the whole time axis.31

Definition. The phase flow is the one-parameter group of transformations of phase space

$$g^t: (\mathbf{p}(0), \mathbf{q}(0)) \mapsto (\mathbf{p}(t), \mathbf{q}(t)),$$

where $\mathbf{p}(t)$ and $\mathbf{q}(t)$ are solutions of Hamilton’s system of equations (Figure 47).

Problem. Show that $\{g^t\}$ is a group.

31 For this it is sufficient, for example, that the level sets of $H$ be compact.

Figure 47 Phase flow


B  Liouville's  theorem 

Theorem 1. The phase flow preserves volume: for any region $D$ we have (Figure 48)

$$\text{volume of } g^t D = \text{volume of } D.$$

We will prove the following slightly more general proposition also due to Liouville.

Figure 48 Conservation of volume

Suppose we are given a system of ordinary differential equations $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, $\mathbf{x} = (x_1, \ldots, x_n)$, whose solution may be extended to the whole time axis. Let $\{g^t\}$ be the corresponding group of transformations:

(1) $g^t(\mathbf{x}) = \mathbf{x} + \mathbf{f}(\mathbf{x})t + O(t^2), \quad (t \to 0).$

Let $D(0)$ be a region in $\mathbf{x}$-space and $v(0)$ its volume;

$$v(t) = \text{volume of } D(t), \qquad D(t) = g^t D(0).$$

Theorem 2. If $\operatorname{div} \mathbf{f} = 0$, then $g^t$ preserves volume: $v(t) = v(0)$.

C  Proof 

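Lemma 1. $(dv/dt)|_{t=0} = \int_{D(0)} \operatorname{div} \mathbf{f}\,dx$ $\;(dx = dx_1 \cdots dx_n)$.

Proof. For any $t$, the formula for changing variables in a multiple integral gives

$$v(t) = \int_{D(0)} \det \frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}}\,dx.$$

Calculating $\partial g^t \mathbf{x}/\partial \mathbf{x}$ by formula (1), we find

$$\frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}} = E + \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\,t + O(t^2) \quad \text{as } t \to 0.$$

We will now use a well-known algebraic fact: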


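Lemma 2. For any matrix $A = (a_{ij})$,

$$\det(E + At) = 1 + t \operatorname{tr} A + O(t^2), \quad t \to 0,$$

where $\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$ is the trace of $A$ (the sum of the diagonal elements).

(The proof of Lemma 2 is obtained by a direct expansion of the determinant: we get 1 and $n$ terms in $t$; the remaining terms involve $t^2$, $t^3$, etc.) Using this, we have

$$\det \frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}} = 1 + t \operatorname{tr} \frac{\partial \mathbf{f}}{\partial \mathbf{x}} + O(t^2).$$

But $\operatorname{tr} \partial \mathbf{f}/\partial \mathbf{x} = \sum_{i=1}^{n} \partial f_i/\partial x_i = \operatorname{div} \mathbf{f}$. Therefore,

$$v(t) = \int_{D(0)} \left[1 + t \operatorname{div} \mathbf{f} + O(t^2)\right] dx,$$

which proves Lemma 1.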


□ 
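Lemma 2 is easy to test numerically. The sketch below assumes an arbitrary $3 \times 3$ matrix `A` chosen only for illustration and measures the remainder $\det(E + At) - (1 + t \operatorname{tr} A)$, which should shrink like $t^2$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical sample matrix; any 3x3 matrix would do.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 1.0],
     [0.5, 0.0, -2.0]]


def remainder(t):
    """det(E + A t) - (1 + t tr A); Lemma 2 says this is O(t^2)."""
    m = [[(1.0 if r == c else 0.0) + A[r][c] * t for c in range(3)]
         for r in range(3)]
    trace = sum(A[k][k] for k in range(3))
    return det3(m) - (1.0 + t * trace)
```

Reducing $t$ by a factor of 10 should reduce `remainder(t)` by roughly a factor of 100, which is the quadratic behaviour the lemma asserts.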


Proof of theorem 2. Since $t = t_0$ is no worse than $t = 0$, Lemma 1 can be written in the form

$$\left.\frac{dv(t)}{dt}\right|_{t = t_0} = \int_{D(t_0)} \operatorname{div} \mathbf{f}\,dx,$$

and if $\operatorname{div} \mathbf{f} = 0$, $dv/dt = 0$.


□ 


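In particular, for Hamilton’s equations we have

$$\operatorname{div} \mathbf{f} = \frac{\partial}{\partial \mathbf{p}}\left(-\frac{\partial H}{\partial \mathbf{q}}\right) + \frac{\partial}{\partial \mathbf{q}}\left(\frac{\partial H}{\partial \mathbf{p}}\right) \equiv 0.$$

This proves Liouville’s theorem (Theorem 1).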


□ 
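Volume preservation can be probed numerically through the Jacobian determinant of the flow map, which by the change-of-variables formula in Lemma 1 must equal 1 for a hamiltonian system. The sketch assumes a pendulum ($H = \tfrac{1}{2}p^2 - \cos q$) with an optional friction term $-\gamma p$ added for contrast; the friction term, the Runge-Kutta integrator, and the finite-difference step are all illustrative assumptions. With friction, $\operatorname{div} \mathbf{f} = -\gamma$ and the determinant decays like $e^{-\gamma t}$.

```python
import math


def flow(p, q, t_end, gamma=0.0, steps=400):
    """RK4 approximation of g^t for dq/dt = p, dp/dt = -sin q - gamma * p."""
    def rhs(p, q):
        return -math.sin(q) - gamma * p, p

    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(p, q)
        k2 = rhs(p + dt / 2 * k1[0], q + dt / 2 * k1[1])
        k3 = rhs(p + dt / 2 * k2[0], q + dt / 2 * k2[1])
        k4 = rhs(p + dt * k3[0], q + dt * k3[1])
        p += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        q += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return p, q


def jacobian_det(p, q, t_end, gamma=0.0, h=1e-5):
    """Determinant of d(g^t(p, q))/d(p, q) by central finite differences."""
    p_hi, q_hi = flow(p + h, q, t_end, gamma)
    p_lo, q_lo = flow(p - h, q, t_end, gamma)
    dp_dp, dq_dp = (p_hi - p_lo) / (2 * h), (q_hi - q_lo) / (2 * h)
    p_hi, q_hi = flow(p, q + h, t_end, gamma)
    p_lo, q_lo = flow(p, q - h, t_end, gamma)
    dp_dq, dq_dq = (p_hi - p_lo) / (2 * h), (q_hi - q_lo) / (2 * h)
    return dp_dp * dq_dq - dp_dq * dq_dp
```

With `gamma=0.0` the determinant stays close to 1 at any initial point; with `gamma=0.5` and `t_end=2.0` it is close to $e^{-1}$, so phase volume contracts and the system is no longer hamiltonian.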


Problem. Prove Liouville’s formula $W = W_0 e^{\int \operatorname{tr} A\,dt}$ for the Wronskian determinant of the linear system $\dot{\mathbf{x}} = A(t)\mathbf{x}$.


Liouville’s  theorem  has  many  applications. 


Problem.  Show  that  in  a  hamiltonian  system  it  is  impossible  to  have 
asymptotically  stable  equilibrium  positions  and  asymptotically  stable  limit 
cycles  in  the  phase  space. 

Liouville’s theorem has particularly important applications in statistical mechanics.

Liouville’s theorem allows one to apply methods of ergodic theory32 to the study of mechanics. We consider only the simplest example:

D Poincaré's recurrence theorem

Let $g$ be a volume-preserving continuous one-to-one mapping which maps a bounded region $D$ of euclidean space onto itself: $gD = D$.

Then in any neighborhood $U$ of any point of $D$ there is a point $x \in U$ which returns to $U$, i.e., $g^n x \in U$ for some $n > 0$.


Figure 49 The way a ball will move in an asymmetrical cup is unknown; however Poincaré’s theorem predicts that it will return to a neighborhood of the original position.

This theorem applies, for example, to the phase flow $g^t$ of a two-dimensional system whose potential $U(x_1, x_2)$ goes to infinity as $(x_1, x_2) \to \infty$; in this case the invariant bounded region in phase space is given by the condition (Figure 49)

$$D = \{\mathbf{p}, \mathbf{q}: T + U \le E\}.$$

Poincaré’s theorem can be strengthened, showing that almost every moving point returns repeatedly to the vicinity of its initial position. This is one of the few general conclusions which can be drawn about the character of motion. The details of motion are not known at all, even in the case

$$\ddot{\mathbf{x}} = -\frac{\partial U}{\partial \mathbf{x}}, \quad \text{where } \mathbf{x} = (x_1, x_2).$$

The following prediction is a paradoxical conclusion from the theorems of Poincaré and Liouville: if you open a partition separating a chamber containing gas and a chamber with a vacuum, then after a while the gas molecules will again collect in the first chamber (Figure 50).

The  resolution  of  the  paradox  lies  in  the  fact  that  “a  while”  may  be  longer 
than  the  duration  of  the  solar  system’s  existence. 

32 Cf., for example, the book: Halmos, Lectures on Ergodic Theory, 1956 (Mathematical Society of Japan. Publications. No. 3).

Figure 50 Molecules return to the first chamber.

Figure 51 Theorem on returning


Proof of Poincaré’s theorem. We consider the images of the neighborhood $U$ (Figure 51):

$$U, gU, g^2U, \ldots, g^nU, \ldots$$

All of these have the same volume. If they never intersected, $D$ would have infinite volume. Therefore, for some $k \ge 0$ and $l \ge 0$, with $k > l$,

$$g^kU \cap g^lU \ne \emptyset.$$

Therefore, $g^{k-l}U \cap U \ne \emptyset$. If $y$ is in this intersection, then $y = g^n x$, with $x \in U$ $(n = k - l)$. Then $x \in U$ and $g^n x \in U$ $(n = k - l)$. □

E Applications of Poincaré's theorem

Example 1. Let $D$ be a circle and $g$ rotation through an angle $\alpha$. If $\alpha = 2\pi(m/n)$, then $g^n$ is the identity, and the theorem is obvious. If $\alpha$ is not commensurable with $2\pi$, then Poincaré’s theorem gives

$$\forall \delta > 0,\ \exists n: |g^n x - x| < \delta \quad \text{(Figure 52)}.$$

Figure 52 Dense set on the circle

It easily follows that

Theorem. If $\alpha \ne 2\pi(m/n)$, then the set of points $g^k x$ is dense33 on the circle $(k = 1, 2, \ldots)$.
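The recurrence statement for the rotation of the circle can be explored by direct search. The sketch below is a brute-force illustration, assuming distance is measured along the circle and that points are represented by angles in floating point; the function names and the search cap `max_n` are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def circle_distance(angle):
    """Distance along the circle from `angle` (taken mod 2*pi) to 0."""
    a = angle % TWO_PI
    return min(a, TWO_PI - a)


def first_return(alpha, delta, max_n=10 ** 6):
    """Smallest n > 0 with |g^n x - x| < delta for rotation by alpha, or None."""
    for n in range(1, max_n + 1):
        if circle_distance(n * alpha) < delta:
            return n
    return None
```

For a commensurable angle such as $\alpha = 2\pi \cdot 3/7$ the search returns 7, the denominator. For $\alpha = 1$ radian, which is not commensurable with $2\pi$, some return time exists for every $\delta > 0$, although it grows as $\delta$ shrinks.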

Problem. Show that every orbit of motion in a central field with $U = r^4$ is either closed or densely fills the ring between two circles.

Example 2. Let $D$ be the two-dimensional torus and $\varphi_1$ and $\varphi_2$ angular coordinates on it (longitude and latitude) (Figure 53).

Consider the system of ordinary differential equations on the torus

$$\dot{\varphi}_1 = \alpha_1, \qquad \dot{\varphi}_2 = \alpha_2.$$

Clearly, $\operatorname{div} \mathbf{f} = 0$ and the corresponding motion

$$g^t: (\varphi_1, \varphi_2) \to (\varphi_1 + \alpha_1 t, \varphi_2 + \alpha_2 t)$$

preserves the volume $d\varphi_1\,d\varphi_2$. From Poincaré’s theorem it is easy to deduce

Theorem. If $\alpha_1/\alpha_2$ is irrational, then the “winding line” on the torus, $g^t(\varphi_1, \varphi_2)$, is dense in the torus.

Problem. Show that if $\omega$ is irrational, then the Lissajous figure ($x = \cos t$, $y = \cos \omega t$) is dense in the square $|x| \le 1$, $|y| \le 1$.

Example 3. Let $D$ be the $n$-dimensional torus $T^n$, i.e., the direct product34 of $n$ circles:

$$D = \underbrace{S^1 \times S^1 \times \cdots \times S^1}_{n} = T^n.$$

A point on the $n$-dimensional torus is given by $n$ angular coordinates $\boldsymbol{\varphi} = (\varphi_1, \ldots, \varphi_n)$. Let $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$, and let $g^t$ be the volume-preserving transformation

$$g^t: T^n \to T^n, \qquad \boldsymbol{\varphi} \to \boldsymbol{\varphi} + \boldsymbol{\alpha}t.$$

Problem. Under which conditions on $\boldsymbol{\alpha}$ are the following sets dense: (a) the trajectory $\{g^t \boldsymbol{\varphi}\}$; (b) the trajectory $\{g^k \boldsymbol{\varphi}\}$ ($t$ belongs to the group of real numbers $\mathbb{R}$, $k$ to the group of integers $\mathbb{Z}$).

33 A set $A$ is dense in $B$ if there is a point of $A$ in every neighborhood of every point of $B$.

34 The direct product of the sets $A, B, \ldots$ is the set of points $(a, b, \ldots)$, with $a \in A$, $b \in B$, $\ldots$

The transformations in Examples 1 to 3 are closely connected to mechanics. But since Poincaré’s theorem is abstract, it also has applications unconnected with mechanics.


Example 4. Consider the first digits of the numbers $2^n$: 1, 2, 4, 8, 1, 3, 6, 1, 2, 5, 1, 2, 4, ....
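The sequence in Example 4 is easy to generate for experimentation. The sketch below uses exact integer arithmetic to read off leading digits; the function names are hypothetical. The link to the preceding examples is that the first digit of $2^n$ is determined by the fractional part of $n \log_{10} 2$, that is, by a rotation of the circle through an angle not commensurable with a full turn.

```python
def first_digits(count):
    """Leading decimal digits of 2**0, 2**1, ..., 2**(count - 1), using exact integers."""
    return [int(str(2 ** n)[0]) for n in range(count)]


def digit_frequencies(count):
    """How often each leading digit 1..9 occurs among the first `count` powers of 2."""
    digits = first_digits(count)
    return {d: digits.count(d) for d in range(1, 10)}
```

Calling `first_digits(13)` reproduces the digits listed above, and `digit_frequencies` with a large argument gives empirical counts to compare against whatever answer one derives for the problem that follows.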

Problem. Does the digit 7 appear in this sequence? Which digit appears more often, 7 or 8? How many times more often?

Lagrangian mechanics on manifolds


In  this  chapter  we  introduce  the  concepts  of  a  differentiable  manifold  and 
its  tangent  bundle.  A  lagrangian  function,  given  on  the  tangent  bundle, 
defines  a  lagrangian  “holonomic  system”  on  a  manifold.  Systems  of  point 
masses  with  holonomic  constraints  (e.g.,  a  pendulum  or  a  rigid  body)  are 
special  cases. 


17  Holonomic  constraints 

In  this  paragraph  we  define  the  notion  of  a  system  of  point  masses  with  holonomic  constraints. 

A  Example 

Let $\gamma$ be a smooth curve in the plane. If there is a very strong force field in a neighborhood of $\gamma$, directed towards the curve, then a moving point will always be close to $\gamma$. In the limit case of an infinite force field, the point must remain on the curve $\gamma$. In this case we say that a constraint is put on the system (Figure 54).

To formulate this precisely, we introduce curvilinear coordinates $q_1$ and $q_2$ on a neighborhood of $\gamma$; $q_1$ is in the direction of $\gamma$ and $q_2$ is distance from the curve.

We consider the system with potential energy

$$U_N = Nq_2^2 + U_0(q_1, q_2),$$

depending on the parameter $N$ (which we will let tend to infinity) (Figure 55). We consider the initial conditions on $\gamma$:

$$q_1(0) = q_1^0, \quad \dot{q}_1(0) = \dot{q}_1^0, \quad q_2(0) = 0, \quad \dot{q}_2(0) = 0.$$

Figure 54 Constraint as an infinitely strong field

Figure 55 Potential energy $U_N$


Denote by $q_1 = \varphi(t, N)$ the evolution of the coordinate $q_1$ under a motion with these initial conditions in the field $U_N$.

Theorem. The following limit exists, as $N \to \infty$:

$$\lim_{N \to \infty} \varphi(t, N) = \psi(t).$$

The limit $q_1 = \psi(t)$ satisfies Lagrange's equation

$$\frac{d}{dt}\left(\frac{\partial L_*}{\partial \dot{q}_1}\right) = \frac{\partial L_*}{\partial q_1},$$

where $L_*(q_1, \dot{q}_1) = T|_{q_2 = \dot{q}_2 = 0} - U_0|_{q_2 = 0}$ ($T$ is the kinetic energy of motion along $\gamma$).

Thus, as $N \to \infty$, Lagrange’s equations for $q_1$ and $q_2$ induce Lagrange’s equation for $q_1 = \psi(t)$.
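The limiting behaviour can be watched in a toy computation. The sketch below is an illustration under strong simplifying assumptions that are not from the text: the curve $\gamma$ is the straight line $q_2 = 0$ in Cartesian coordinates, the mass is 1, and $U_0 = \tfrac{1}{2}q_1^2 + q_1 q_2$, so the constrained lagrangian is $L_* = \tfrac{1}{2}\dot{q}_1^2 - \tfrac{1}{2}q_1^2$ and the limit motion from $q_1(0) = 1$, $\dot{q}_1(0) = 0$ is $\psi(t) = \cos t$. A velocity Verlet scheme is used as the integrator.

```python
import math


def simulate(N, t_end=5.0, dt=1e-3):
    """Integrate motion in U_N = N*q2**2 + q1**2/2 + q1*q2 (unit mass).

    Starts on the curve q2 = 0 with zero velocity and returns
    (largest |q2| seen, q1 at time t_end).
    """
    q1, q2, v1, v2 = 1.0, 0.0, 0.0, 0.0

    def force(a, b):
        # Minus the gradient of U_N with respect to (q1, q2).
        return -(a + b), -(a + 2.0 * N * b)

    f1, f2 = force(q1, q2)
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        v1 += 0.5 * dt * f1
        v2 += 0.5 * dt * f2
        q1 += dt * v1
        q2 += dt * v2
        f1, f2 = force(q1, q2)
        v1 += 0.5 * dt * f1
        v2 += 0.5 * dt * f2
        worst = max(worst, abs(q2))
    return worst, q1
```

As $N$ grows from $10^2$ to $10^4$, the largest excursion $|q_2|$ shrinks, in line with the energy bound $|q_2| \le cN^{-1/2}$, and `q1` at the final time approaches $\cos 5$, the value predicted by the constrained equation.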

We obtain exactly the same result if we replace the plane by the $3n$-dimensional configuration space of $n$ points, consisting of a mechanical system with metric $ds^2 = \sum_{i=1}^{n} m_i\,d\mathbf{r}_i^2$ (the $m_i$ are masses), replace the curve $\gamma$ by a submanifold of the $3n$-dimensional space, replace $q_1$ by some coordinates $\mathbf{q}_1$ on $\gamma$, and replace $q_2$ by some coordinates $\mathbf{q}_2$ in the directions perpendicular to $\gamma$. If the potential energy has the form

$$U = U_0(\mathbf{q}_1, \mathbf{q}_2) + N\mathbf{q}_2^2,$$

then as $N \to \infty$, a motion on $\gamma$ is defined by Lagrange’s equations with the lagrangian function

$$L_* = T|_{\mathbf{q}_2 = \dot{\mathbf{q}}_2 = 0} - U_0|_{\mathbf{q}_2 = 0}.$$

B Definition of a system with constraints

We  will  not  prove  the  theorem  above,35  but  neither  will  we  use  it.  We  need 
it  only  to  justify  the  following. 


Definition. Let $\gamma$ be an $m$-dimensional surface in the $3n$-dimensional configuration space of the points $\mathbf{r}_1, \ldots, \mathbf{r}_n$ with masses $m_1, \ldots, m_n$. Let $\mathbf{q} = (q_1, \ldots, q_m)$ be some coordinates on $\gamma$: $\mathbf{r}_i = \mathbf{r}_i(\mathbf{q})$. The system described by the equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \mathbf{q}}, \qquad L = \tfrac{1}{2}\sum m_i \dot{\mathbf{r}}_i^2 - U(\mathbf{q})$$

is called a system of $n$ points with $3n - m$ ideal holonomic constraints. The surface $\gamma$ is called the configuration space of the system with constraints.

If the surface $\gamma$ is given by $k = 3n - m$ functionally independent equations $f_1(\mathbf{r}) = 0, \ldots, f_k(\mathbf{r}) = 0$, then we say that the system is constrained by the relations $f_1 = 0, \ldots, f_k = 0$.


Holonomic  constraints  also  could  have  been  defined  as  the  limiting  case 
of  a  system  with  a  large  potential  energy.  The  meaning  of  these  constraints  in 
mechanics  lies  in  the  experimentally  determined  fact  that  many  mechanical 
systems  belong  to  this  class  more  or  less  exactly. 

From  now  on,  for  convenience,  we  will  call  ideal  holonomic  constraints 
simply  constraints.  Other  constraints  will  not  be  considered  in  this  book. 


18 Differentiable manifolds

The configuration space of a system with constraints is a differentiable manifold. In this paragraph we give the elementary facts about differentiable manifolds.

A  Definition  of  a  differentiable  manifold 

A  set  M  is  given  the  structure  of  a  differentiable  manifold  if  M  is  provided 
with  a  finite  or  countable  collection  of  charts,  so  that  every  point  is  represented 
in  at  least  one  chart. 

A chart is an open set $U$ in the euclidean coordinate space $\mathbf{q} = (q_1, \ldots, q_n)$, together with a one-to-one mapping $\varphi$ of $U$ onto some subset of $M$, $\varphi: U \to \varphi U \subset M$.

We assume that if points $\mathbf{p}$ and $\mathbf{p}'$ in two charts $U$ and $U'$ have the same image in $M$, then $\mathbf{p}$ and $\mathbf{p}'$ have neighborhoods $V \subset U$ and $V' \subset U'$ with the same image in $M$ (Figure 56). In this way we get a mapping $V \to V'$.

This is a mapping of the region $V$ of the euclidean space $\mathbf{q}$ onto the region $V'$ of the euclidean space $\mathbf{q}'$, and it is given by $n$ functions of $n$ variables, $\mathbf{q}' = \mathbf{q}'(\mathbf{q})$, ($\mathbf{q} = \mathbf{q}(\mathbf{q}')$). The charts $U$ and $U'$ are called compatible if these functions are differentiable.36

35 The proof is based on the fact that, due to the conservation of energy, a moving point cannot move further from $\gamma$ than $cN^{-1/2}$, which approaches zero as $N \to \infty$.

An  atlas  is  a  union  of  compatible  charts.  Two  atlases  are  equivalent  if 
their  union  is  also  an  atlas. 

A  differentiable  manifold  is  a  class  of  equivalent  atlases.  We  will  consider 
only  connected  manifolds.37  Then  the  number  n  will  be  the  same  for  all 
charts;  it  is  called  the  dimension  of  the  manifold. 

A neighborhood of a point on a manifold is the image under a mapping $\varphi: U \to M$ of a neighborhood of the representation of this point in a chart $U$. We will assume that every two different points have non-intersecting neighborhoods.

B  Examples 

Example 1. Euclidean space $\mathbb{R}^n$ is a manifold, with an atlas consisting of one chart.

Example 2. The sphere $S^2 = \{(x, y, z): x^2 + y^2 + z^2 = 1\}$ has the structure of a manifold, with atlas, for example, consisting of two charts $(U_i, \varphi_i,\ i = 1, 2)$ in stereographic projection (Figure 57). An analogous construction applies to the $n$-sphere

$$S^n = \{(x_1, \ldots, x_{n+1}): \sum x_i^2 = 1\}.$$
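The two stereographic charts of Example 2 can be written out explicitly. The sketch below assumes projection from the north pole $(0, 0, 1)$ and the south pole $(0, 0, -1)$ onto the equatorial plane; the function names and the choice of poles are illustrative. On the overlap the change of coordinates is $\mathbf{q}' = \mathbf{q}/|\mathbf{q}|^2$, which is differentiable away from $\mathbf{q} = 0$, so the charts are compatible.

```python
def phi_north(u, v):
    """Chart map R^2 -> S^2 minus the north pole (inverse stereographic projection)."""
    s = u * u + v * v
    return (2 * u / (s + 1), 2 * v / (s + 1), (s - 1) / (s + 1))


def phi_south(u, v):
    """Chart map R^2 -> S^2 minus the south pole."""
    s = u * u + v * v
    return (2 * u / (s + 1), 2 * v / (s + 1), (1 - s) / (s + 1))


def south_coords(x, y, z):
    """Coordinates of a sphere point in the south-pole chart (requires z != -1)."""
    return (x / (1 + z), y / (1 + z))


def transition(u, v):
    """Change of coordinates from the north chart to the south chart, for (u, v) != 0."""
    return south_coords(*phi_north(u, v))
```

For instance, `transition(0.3, -1.2)` agrees with $(0.3, -1.2)$ divided by $0.3^2 + 1.2^2$, and applying `phi_south` to the result lands on the same point of the sphere as `phi_north(0.3, -1.2)`.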


Example 3. Consider a planar pendulum. Its configuration space, the circle $S^1$, is a manifold. The usual atlas is furnished by the angular coordinates $\varphi: \mathbb{R}^1 \to S^1$, $U_1 = (-\pi, \pi)$, $U_2 = (0, 2\pi)$ (Figure 58).

Example 4. The configuration space of the “spherical” mathematical pendulum is the two-dimensional sphere $S^2$ (Figure 58).


36 By differentiable here we mean $r$ times continuously differentiable; the exact value of $r$ ($1 \le r \le \infty$) is immaterial (we may take $r = \infty$, for example).

37 A manifold is connected if it cannot be divided into two disjoint open subsets.

Example 5. The configuration space of a “planar double pendulum” is the direct product of two circles, i.e., the two-torus $T^2 = S^1 \times S^1$ (Figure 58).

Example  6.  The  configuration  space  of  a  spherical  double  pendulum  is  the  direct  product  of 
two  spheres,  S2  x  S2. 

Example 7. A rigid line segment in the $(q_1, q_2)$-plane has for its configuration space the manifold $\mathbb{R}^2 \times S^1$, with coordinates $q_1, q_2, q_3$ (Figure 59). It is covered by two charts.


Figure  59  Configuration  space  of  a  segment  in  the  plane 


Example 8. A rigid right triangle $OAB$ moves around the vertex $O$. The position of the triangle is given by three numbers: the direction $OA \in S^2$ is given by two numbers, and if $OA$ is given, one can rotate $OB \in S^1$ around the axis $OA$ (Figure 60).

Connected with the position of the triangle $OAB$ is an orthogonal right-handed frame, $\mathbf{e}_1 = OA/|OA|$, $\mathbf{e}_2 = OB/|OB|$, $\mathbf{e}_3 = [\mathbf{e}_1, \mathbf{e}_2]$. The correspondence is one-to-one; therefore the position of the triangle is given by an orthogonal three-by-three matrix with determinant 1.


Figure  60  Configuration  space  of  a  triangle 


The set of all three-by-three matrices is the nine-dimensional space $\mathbb{R}^9$. Six orthogonality conditions select out two three-dimensional connected manifolds of matrices with determinant $+1$ and $-1$. The rotations of three-space (determinant $+1$) form a group, which we call $SO(3)$. Therefore, the configuration space of the triangle $OAB$ is $SO(3)$.
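The passage from a position of the triangle to a matrix in $SO(3)$ can be made concrete. The sketch below assumes the right angle of the triangle is at the vertex $O$, so that $OA \perp OB$, and uses hypothetical sample vectors; it builds the frame $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3 = [\mathbf{e}_1, \mathbf{e}_2]$ and checks the six orthogonality conditions and the sign of the determinant.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    """Vector product [a, b]."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def frame(oa, ob):
    """Rows e1, e2, e3 of the matrix attached to the triangle (assumes OA is perpendicular to OB)."""
    e1, e2 = normalize(oa), normalize(ob)
    return [e1, e2, cross(e1, e2)]


def gram(m):
    """Matrix of pairwise scalar products of the rows; the identity for an orthogonal matrix."""
    return [[sum(a * b for a, b in zip(r, s)) for s in m] for r in m]


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


R = frame([1.0, 2.0, 2.0], [2.0, 1.0, -2.0])   # sample perpendicular edges
```

Because the Gram matrix is symmetric, requiring it to equal the identity imposes six independent equations on the nine matrix entries, which is the count used in the text.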

Problem. Show that $SO(3)$ is homeomorphic to three-dimensional real projective space.

Definition. The dimension of the configuration space is called the number of degrees of freedom.

Example  9.  Consider  a  system  of  k  rods  in  a  closed  chain  with  hinged  joints. 

Problem.  How  many  degrees  of  freedom  does  this  system  have? 

Example 10. Embedded manifolds. We say that $M$ is an embedded $k$-dimensional sub-manifold of euclidean space $\mathbb{R}^n$ (Figure 61) if in a neighborhood $U$ of every point $\mathbf{x} \in M$ there are $n - k$ functions $f_1: U \to \mathbb{R}$, $f_2: U \to \mathbb{R}$, $\ldots$, $f_{n-k}: U \to \mathbb{R}$ such that the intersection of $U$ with $M$ is given by the equations $f_1 = 0, \ldots, f_{n-k} = 0$, and the vectors $\operatorname{grad} f_1, \ldots, \operatorname{grad} f_{n-k}$ at $\mathbf{x}$ are linearly independent.

Figure 61 Embedded submanifold

It is easy to give $M$ the structure of a manifold, i.e., coordinates in a neighborhood of $\mathbf{x}$ (how?). It can be shown that every manifold can be embedded in some euclidean space. In Example 8, $SO(3)$ is a subset of $\mathbb{R}^9$.

Problem. Show that $SO(3)$ is embedded in $\mathbb{R}^9$, and at the same time, that $SO(3)$ is a manifold.

C  Tangent  space 

If $M$ is a $k$-dimensional manifold embedded in $\mathbb{R}^n$, then at every point $\mathbf{x}$ we have a $k$-dimensional tangent space $TM_{\mathbf{x}}$. Namely, $TM_{\mathbf{x}}$ is the orthogonal complement to $\{\operatorname{grad} f_1, \ldots, \operatorname{grad} f_{n-k}\}$ (Figure 62). The vectors of the tangent space $TM_{\mathbf{x}}$ based at $\mathbf{x}$ are called tangent vectors to $M$ at $\mathbf{x}$. We can also define these vectors directly as velocity vectors of curves in $M$:

$$\dot{\mathbf{x}} = \lim_{t \to 0} \frac{\varphi(t) - \varphi(0)}{t}, \quad \text{where } \varphi(0) = \mathbf{x},\ \varphi(t) \in M.$$

Figure 62 Tangent space

The definition of tangent vectors can also be given in intrinsic terms, independent of the embedding of $M$ into $\mathbb{R}^n$.

We will call two curves $\mathbf{x} = \varphi(t)$ and $\mathbf{x} = \psi(t)$ equivalent if $\varphi(0) = \psi(0) = \mathbf{x}$ and $\lim_{t \to 0} (\varphi(t) - \psi(t))/t = 0$ in some chart. Then this tangent relationship is true in any chart (prove this!).

Definition. A tangent vector to a manifold $M$ at the point $\mathbf{x}$ is an equivalence class of curves $\varphi(t)$, with $\varphi(0) = \mathbf{x}$.

It is easy to define the operations of multiplication of a tangent vector by a number and addition of tangent vectors. The set of tangent vectors to $M$ at $\mathbf{x}$ forms a vector space $TM_{\mathbf{x}}$. This space is also called the tangent space to $M$ at $\mathbf{x}$.

For  embedded  manifolds  the  definition  above  agrees  with  the  previous 
definition.  Its  advantage  lies  in  the  fact  that  it  also  holds  for  abstract 
manifolds,  not  embedded  anywhere. 

Definition. Let $U$ be a chart of an atlas for $M$ with coordinates $q_1, \ldots, q_n$. Then the components of the tangent vector to the curve $\mathbf{q} = \varphi(t)$ are the numbers $\xi_1, \ldots, \xi_n$, where $\xi_i = (d\varphi_i/dt)|_{t=0}$.

D  The  tangent  bundle 

The union of the tangent spaces to $M$ at the various points, $\bigcup_{\mathbf{x} \in M} TM_{\mathbf{x}}$, has a natural differentiable manifold structure, the dimension of which is twice the dimension of $M$.

This manifold is called the tangent bundle of $M$ and is denoted by $TM$. A point of $TM$ is a vector $\boldsymbol{\xi}$, tangent to $M$ at some point $\mathbf{x}$. Local coordinates on $TM$ are constructed as follows. Let $q_1, \ldots, q_n$ be local coordinates on $M$, and $\xi_1, \ldots, \xi_n$ components of a tangent vector in this coordinate system. Then the $2n$ numbers $(q_1, \ldots, q_n, \xi_1, \ldots, \xi_n)$ give a local coordinate system on $TM$. One sometimes writes $dq_i$ for $\xi_i$.

The mapping $p: TM \to M$ which takes a tangent vector $\boldsymbol{\xi}$ to the point $\mathbf{x} \in M$ at which the vector is tangent to $M$ ($\boldsymbol{\xi} \in TM_{\mathbf{x}}$), is called the natural projection. The inverse image of a point $\mathbf{x} \in M$ under the natural projection, $p^{-1}(\mathbf{x})$, is the tangent space $TM_{\mathbf{x}}$. This space is called the fiber of the tangent bundle over the point $\mathbf{x}$.

E  Riemannian  manifolds 

If $M$ is a manifold embedded in euclidean space, then the metric on euclidean space allows us to measure the lengths of curves, angles between vectors, volumes, etc. All of these quantities are expressed by means of the lengths of tangent vectors, that is, by the positive definite quadratic form given on every tangent space $TM_{\mathbf{x}}$ (Figure 63):

$$TM_{\mathbf{x}} \to \mathbb{R}, \qquad \boldsymbol{\xi} \to \langle \boldsymbol{\xi}, \boldsymbol{\xi} \rangle.$$

Figure 63 Riemannian metric

For example, the length of a curve on a manifold is expressed using this form as $l(\gamma) = \int_{\mathbf{x}_0}^{\mathbf{x}_1} \sqrt{\langle d\mathbf{x}, d\mathbf{x} \rangle}$, or, if the curve is given parametrically, $\gamma: [t_0, t_1] \to M$, $t \to \mathbf{x}(t) \in M$, then

$$l(\gamma) = \int_{t_0}^{t_1} \sqrt{\langle \dot{\mathbf{x}}, \dot{\mathbf{x}} \rangle}\,dt.$$


Definition. A differentiable manifold with a fixed positive definite quadratic form on every tangent space $TM_{\mathbf{x}}$ is called a Riemannian manifold.

The  quadratic  form  is  called  the  Riemannian  metric. 

Remark. Let $U$ be a chart of an atlas for $M$ with coordinates $q_1, \ldots, q_n$. Then a Riemannian metric is given by the formula

$$ds^2 = \sum_{i,j=1}^{n} a_{ij}(\mathbf{q})\,dq_i\,dq_j, \qquad a_{ij} = a_{ji},$$

where $dq_i$ are the coordinates of a tangent vector.

The functions $a_{ij}(\mathbf{q})$ are assumed to be differentiable as many times as necessary.
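The coordinate formula for the metric turns the length integral into an ordinary one-dimensional integral. The sketch below assumes the unit sphere with spherical coordinates $(\theta, \varphi)$, where $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$, and a simple midpoint rule; the metric, the sample curves, and the function names are illustrative choices.

```python
import math


def curve_length(metric, curve, t0, t1, steps=1000):
    """Midpoint-rule approximation of the integral of sqrt(a_ij(q) q_i' q_j') dt."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        qa, qb = curve(t0 + k * dt), curve(t0 + (k + 1) * dt)
        qm = [(a + b) / 2 for a, b in zip(qa, qb)]    # coordinate midpoint of the step
        v = [(b - a) / dt for a, b in zip(qa, qb)]    # difference quotient for dq/dt
        a_ij = metric(qm)
        n = len(v)
        speed2 = sum(a_ij[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
        total += math.sqrt(speed2) * dt
    return total


def sphere_metric(q):
    """Coefficients a_ij for ds^2 = dtheta^2 + sin(theta)^2 dphi^2."""
    theta, _phi = q
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def latitude(t):
    """Circle of constant theta = pi/3, parametrized by phi = t."""
    return (math.pi / 3, t)


def meridian(t):
    """Arc of constant phi = 0, parametrized by theta = t."""
    return (t, 0.0)
```

The latitude circle traversed over $[0, 2\pi]$ has length $2\pi \sin(\pi/3)$, and a meridian arc over $[0.2, 1.2]$ has length 1, which the routine reproduces.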

F  The  derivative  map 

Let $f: M \to N$ be a mapping of a manifold $M$ to a manifold $N$. $f$ is called differentiable if in local coordinates on $M$ and $N$ it is given by differentiable functions.


Definition. The derivative of a differentiable mapping $f: M \to N$ at a point $\mathbf{x} \in M$ is the linear map of the tangent spaces

$$f_{*\mathbf{x}}: TM_{\mathbf{x}} \to TN_{f(\mathbf{x})},$$

which is given in the following way (Figure 64):

Let $\mathbf{v} \in TM_{\mathbf{x}}$. Consider a curve $\varphi: \mathbb{R} \to M$ with $\varphi(0) = \mathbf{x}$, and velocity vector $(d\varphi/dt)|_{t=0} = \mathbf{v}$. Then $f_{*\mathbf{x}}\mathbf{v}$ is the velocity vector of the curve $f \circ \varphi: \mathbb{R} \to N$,

$$f_{*\mathbf{x}}\mathbf{v} = \left.\frac{d}{dt}\right|_{t=0} f(\varphi(t)).$$

Figure 64 Derivative of a mapping


Problem. Show that the vector $f_{*x}v$ does not depend on the curve $\varphi$, but only on the vector v.

Problem. Show that the map $f_{*x}: TM_x \to TN_{f(x)}$ is linear.

Problem. Let $x = (x_1, \ldots, x_m)$ be coordinates in a neighborhood of $x \in M$, and $y = (y_1, \ldots, y_n)$
be coordinates in a neighborhood of $y \in N$. Let $\xi$ be the set of components of the vector v, and
$\eta$ the set of components of the vector $f_{*x}v$. Show that

$$\eta = \frac{\partial y}{\partial x}\,\xi, \qquad \text{i.e.,} \qquad \eta_i = \sum_j \frac{\partial y_i}{\partial x_j}\,\xi_j.$$

Taking the union of the mappings $f_{*x}$ for all x, we get a mapping of the whole tangent
bundle

$$f_*: TM \to TN, \qquad f_* v = f_{*x} v \quad \text{for } v \in TM_x.$$

Problem. Show that $f_*$ is a differentiable map.

Problem. Let $f: M \to N$, $g: N \to K$, and $h = g \circ f: M \to K$. Show that $h_* = g_* f_*$.
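The first and third problems above can be checked numerically on a concrete map. The sketch below is only an illustration under stated assumptions: the map f (polar to cartesian coordinates on the plane), the point x0, the vector v, and the bending vector w are all hypothetical choices, and velocities are approximated by central differences rather than computed exactly.

```python
import math

def f(x):
    """Hypothetical map f: (r, phi) -> (y1, y2), polar to cartesian coordinates."""
    r, phi = x
    return [r * math.cos(phi), r * math.sin(phi)]

def jacobian(x):
    """Matrix dy/dx of f, written out by hand."""
    r, phi = x
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]

def image_velocity(curve, h=1e-5):
    """Velocity at t = 0 of the image curve t -> f(curve(t)), by a central difference."""
    plus, minus = f(curve(h)), f(curve(-h))
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]

x0 = [2.0, 0.7]   # base point x
v = [0.3, -1.1]   # tangent vector at x (components xi)
w = [5.0, 4.0]    # arbitrary second-order bending of the curve

def straight(t):
    """Curve through x0 with velocity v."""
    return [x0[i] + t * v[i] for i in range(2)]

def bent(t):
    """A different curve through x0 with the same velocity v at t = 0."""
    return [x0[i] + t * v[i] + t * t * w[i] for i in range(2)]

J = jacobian(x0)
eta_jacobian = [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]
eta_straight = image_velocity(straight)
eta_bent = image_velocity(bent)
```

Both curves give the same image velocity, and it agrees with the Jacobian applied to the components of v, up to the finite-difference error.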

19  Lagrangian  dynamical  systems 

In  this  paragraph  we  define  lagrangian  dynamical  systems  on  manifolds.  Systems  with  holonomic 
constraints  are  a  particular  case. 

A  Definition  of  a  lagrangian  system 

Let M be a differentiable manifold, TM its tangent bundle, and $L: TM \to \mathbb{R}$
a differentiable function. A map $\gamma: \mathbb{R} \to M$ is called a motion in the lagrangian
system with configuration manifold M and lagrangian function L if $\gamma$ is an
extremal of the functional

$$\Phi(\gamma) = \int_{t_0}^{t_1} L(\dot{\gamma})\, dt,$$

where $\dot{\gamma}$ is the velocity vector $\dot{\gamma}(t) \in TM_{\gamma(t)}$.

Example. Let M be a region in a coordinate space with coordinates $q = (q_1, \ldots, q_n)$. The
lagrangian function $L: TM \to \mathbb{R}$ may be written in the form of a function $L(q, \dot{q})$ of the 2n
coordinates. As we showed in Section 12, the evolution of coordinates of a point moving with
time satisfies Lagrange's equations.

Theorem. The evolution of the local coordinates $q = (q_1, \ldots, q_n)$ of a point $\gamma(t)$
under motion in a lagrangian system on a manifold satisfies the Lagrange
equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q},$$

where $L(q, \dot{q})$ is the expression for the function $L: TM \to \mathbb{R}$ in the coordinates
q and $\dot{q}$ on TM.

We  often  encounter  the  following  special  case. 

B  Natural  systems 

Let M be a Riemannian manifold. The quadratic form on each tangent space,

$$T = \tfrac{1}{2}\langle v, v \rangle, \qquad v \in TM_x,$$

is called the kinetic energy. A differentiable function $U: M \to \mathbb{R}$ is called a
potential energy.

Definition. A lagrangian system on a Riemannian manifold is called natural
if the lagrangian function is equal to the difference between kinetic and
potential energies: $L = T - U$.

Example. Consider two mass points $m_1$ and $m_2$ joined by a line segment of length l in the
(x, y)-plane. Then a configuration space of three dimensions

$$M = \mathbb{R}^2 \times S^1 \subset \mathbb{R}^2 \times \mathbb{R}^2$$

is defined in the four-dimensional configuration space $\mathbb{R}^2 \times \mathbb{R}^2$ of two free points $(x_1, y_1)$ and
$(x_2, y_2)$ by the condition $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} = l$ (Figure 65).

Figure 65 Segment in the plane

There is a quadratic form on the tangent space to the four-dimensional space $(x_1, x_2, y_1, y_2)$:

$$m_1(\dot{x}_1^2 + \dot{y}_1^2) + m_2(\dot{x}_2^2 + \dot{y}_2^2).$$

Our three-dimensional manifold, as it is embedded in the four-dimensional one, is provided with
a Riemannian metric. The holonomic system thus obtained is called in mechanics a line segment
of fixed length in the (x, y)-plane. The kinetic energy is given by the formula

$$T = m_1\,\frac{\dot{x}_1^2 + \dot{y}_1^2}{2} + m_2\,\frac{\dot{x}_2^2 + \dot{y}_2^2}{2}.$$

C  Systems with holonomic constraints

In Section 17 we defined the notion of a system of point masses with holonomic
constraints. We will now show that such a system is natural.

Consider the configuration manifold M of a system with constraints as
embedded in the 3n-dimensional configuration space of a system of free
points. The metric on the 3n-dimensional space is given by the quadratic
form $\sum_{i=1}^{n} m_i \dot{r}_i^2$. The embedded Riemannian manifold M with potential
energy U coincides with the system defined in Section 17 or with the limiting
case of the system with potential $U + N q_2^2$, $N \to \infty$, which grows rapidly
outside of M.

D  Procedure  for  solving  problems  with  constraints 

1. Determine the configuration manifold and introduce coordinates
$q_1, \ldots, q_k$ (in a neighborhood of each of its points).

2. Express the kinetic energy $T = \sum \tfrac{1}{2} m_i \dot{r}_i^2$ as a quadratic form in the
generalized velocities

$$T = \tfrac{1}{2} \sum a_{ij}(q)\, \dot{q}_i \dot{q}_j.$$

3. Construct the lagrangian function $L = T - U(q)$ and solve Lagrange's
equations.
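Step 2 of the procedure can be sketched numerically. The example below is a minimal illustration, not part of the text's method: it assumes a single point of hypothetical mass m on a sphere of radius l (a spherical pendulum) with coordinates q = (theta, phi), approximates the derivatives of the embedding by central differences, and assembles the coefficients a_ij(q) as the sum over cartesian components of m (dx/dq_i)(dx/dq_j).

```python
import math

def embedding(q, l=1.0):
    """Hypothetical embedding x(q) of the sphere of radius l, q = (theta, phi)."""
    theta, phi = q
    return [l * math.sin(theta) * math.cos(phi),
            l * math.sin(theta) * math.sin(phi),
            l * math.cos(theta)]

def partials(x_of_q, q, h=1e-6):
    """cols[j][a] = dx_a/dq_j, approximated by central differences."""
    cols = []
    for j in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += h
        q_minus[j] -= h
        x_plus, x_minus = x_of_q(q_plus), x_of_q(q_minus)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(x_plus, x_minus)])
    return cols

def metric(x_of_q, q, masses):
    """a_ij(q) = sum_a m_a (dx_a/dq_i)(dx_a/dq_j); one mass per cartesian component."""
    cols = partials(x_of_q, q)
    n = len(q)
    return [[sum(m * ci * cj for m, ci, cj in zip(masses, cols[i], cols[j]))
             for j in range(n)] for i in range(n)]

def kinetic_energy(a, qdot):
    """T = (1/2) sum a_ij qdot_i qdot_j."""
    n = len(qdot)
    return 0.5 * sum(a[i][j] * qdot[i] * qdot[j] for i in range(n) for j in range(n))

m = 2.0
q = [0.9, 0.4]
a = metric(embedding, q, [m, m, m])
T = kinetic_energy(a, [0.1, -0.3])
```

For this embedding the result should reproduce the familiar form T = (m l^2 / 2)(theta_dot^2 + sin^2(theta) phi_dot^2), that is, a diagonal matrix with entries m l^2 and m l^2 sin^2(theta).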

Example. We consider the motion of a point mass of mass 1 on a surface of revolution in
three-dimensional space. It can be shown that the orbits are geodesics on the surface. In cylindrical
coordinates $r, \varphi, z$ the surface is given (locally) in the form $r = r(z)$ or $z = z(r)$. The kinetic
energy has the form (Figure 66)

$$T = \tfrac{1}{2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) = \tfrac{1}{2}\left[(1 + r'^2)\dot{z}^2 + r^2(z)\dot{\varphi}^2\right]$$

in coordinates $\varphi$ and z (here $r' = dr/dz$), and

$$T = \tfrac{1}{2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) = \tfrac{1}{2}\left[(1 + z'^2)\dot{r}^2 + r^2\dot{\varphi}^2\right]$$

in coordinates r and $\varphi$ (here $z' = dz/dr$). (We have used the identity $\dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2\dot{\varphi}^2$.)

The lagrangian function L is equal to T. In both coordinate systems $\varphi$ is a cyclic coordinate.
The corresponding momentum is preserved; $p_\varphi = r^2\dot{\varphi}$ is nothing other than the z-component of
angular momentum. Since the system has two degrees of freedom, knowing the cyclic coordinate
$\varphi$ is sufficient for integrating the problem completely (cf. Corollary 3, Section 15).

Figure 66 Surface of revolution

We can obtain more easily a clear picture of the orbits by reasoning slightly differently.
Denote by $\alpha$ the angle of the orbit with a meridian. We have $r\dot{\varphi} = |v| \sin\alpha$, where $|v|$ is the
magnitude of the velocity vector (Figure 66).

By the law of conservation of energy, $H = L = T$ is preserved. Therefore, $|v| = \text{const}$, so
the conservation law for $p_\varphi$ takes the form

$$r \sin\alpha = \text{const}$$

("Clairaut's theorem").

This relationship shows that the motion takes place in the region $|\sin\alpha| \le 1$, i.e., $r \ge r_0 \sin\alpha_0$.
Furthermore, the inclination of the orbit from the meridian increases as the radius r decreases.
When the radius reaches the smallest possible value, $r = r_0 \sin\alpha_0$, the orbit is reflected and
returns to the region with larger r (Figure 67).
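Clairaut's relation can be observed numerically. The following sketch is an illustration under assumptions of our own: the profile r(z) = 1 + z^2, the initial data, and the step size are hypothetical, and the motion is integrated with a classical fourth-order Runge-Kutta step applied to Lagrange's equations for L = T in the coordinates (z, phi), namely (1 + r'^2) z'' = r r' phi'^2 - r' r'' z'^2 and phi'' = -2 r' z' phi' / r.

```python
import math

def r(z):
    """Hypothetical profile of the surface of revolution, r = r(z)."""
    return 1.0 + z * z

def dr(z):
    """First derivative r'(z)."""
    return 2.0 * z

def d2r(z):
    """Second derivative r''(z)."""
    return 2.0

def rhs(state):
    """Right-hand side of Lagrange's equations; state = (z, phi, zdot, phidot)."""
    z, phi, zdot, phidot = state
    rp = dr(z)
    zddot = (r(z) * rp * phidot ** 2 - rp * d2r(z) * zdot ** 2) / (1.0 + rp ** 2)
    phiddot = -2.0 * rp * zdot * phidot / r(z)
    return (zdot, phidot, zddot, phiddot)

def rk4_step(state, dt):
    """One classical Runge-Kutta step."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def clairaut(state):
    """r sin(alpha), computed from r * phidot = |v| sin(alpha)."""
    z, _, zdot, phidot = state
    speed = math.sqrt((1.0 + dr(z) ** 2) * zdot ** 2 + (r(z) * phidot) ** 2)
    return r(z) * (r(z) * phidot) / speed

state = (1.0, 0.0, -0.5, 0.4)   # hypothetical initial z, phi, zdot, phidot
c0 = clairaut(state)
drift, r_min = 0.0, r(state[0])
for _ in range(5000):
    state = rk4_step(state, 1e-3)
    drift = max(drift, abs(clairaut(state) - c0))
    r_min = min(r_min, r(state[0]))
```

Along the computed orbit the quantity r sin(alpha) stays constant up to integration error, and the radius never drops below that constant, in agreement with the bound on r stated above.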


Figure  67  Geodesics  on  a  surface  of  revolution 

Problem. Show that the geodesics on a convex surface of revolution are divided into three
classes: meridians, closed curves, and geodesics dense in a ring $r \ge c$.

Problem. Study the behavior of geodesics on the surface of a torus ($(r - R)^2 + z^2 = \rho^2$).

E  Non-autonomous systems

A  lagrangian  non-autonomous  system  differs  from  the  autonomous  systems, 
which  we  have  been  studying  until  now,  by  the  additional  dependence  of  the 
lagrangian  function  on  time: 

$$L: TM \times \mathbb{R} \to \mathbb{R}, \qquad L = L(q, \dot{q}, t).$$

In particular, both the kinetic and potential energies can depend on time in a
non-autonomous natural system:

$$T: TM \times \mathbb{R} \to \mathbb{R}, \quad U: M \times \mathbb{R} \to \mathbb{R}, \qquad T = T(q, \dot{q}, t), \quad U = U(q, t).$$

A system of n mass points, constrained by holonomic constraints dependent
on time, is defined with the help of a time-dependent submanifold of the
configuration space of a free system. Such a manifold is given by a mapping

$$i: M \times \mathbb{R} \to E^{3n}, \qquad i(q, t) = x,$$

which, for any fixed $t \in \mathbb{R}$, defines an embedding $M \to E^{3n}$. The formula of
section D remains true for non-autonomous systems.

Example. Consider the motion of a bead along a vertical circle of radius r (Figure 68) which
rotates with angular velocity $\omega$ around the vertical axis passing through the center O of the
circle. The manifold M is the circle. Let q be the angular coordinate on the circle, measured from
the highest point.

Let x, y, and z be cartesian coordinates in $E^3$ with origin O and vertical axis z. Let $\varphi$ be the
angle of the plane of the circle with the plane xOz. By hypothesis, $\varphi = \omega t$. The mapping
$i: M \times \mathbb{R} \to E^3$ is given by the formula

$$i(q, t) = (r \sin q \cos \omega t,\; r \sin q \sin \omega t,\; r \cos q).$$

From this formula (or, more simply, from an "infinitesimal right triangle") we find that

$$T = \frac{m}{2}(\omega^2 r^2 \sin^2 q + r^2 \dot{q}^2), \qquad U = mgr \cos q.$$

In this case the lagrangian function $L = T - U$ turns out to be independent of t, although the
constraint does depend on time. Furthermore, the lagrangian function turns out to be the same
as in the one-dimensional system with kinetic energy

$$T_0 = \frac{M}{2}\dot{q}^2, \qquad M = mr^2,$$

and with potential energy

$$V = A \cos q - B \sin^2 q, \qquad A = mgr, \quad B = \frac{m}{2}\omega^2 r^2.$$

The form of the phase portrait depends on the ratio between A and B. For $2B < A$ (i.e., for a
rotation of the circle slow enough that $\omega^2 r < g$), the lowest position of the bead ($q = \pi$) is
stable and the characteristics of the motion are generally the same as in the case of a mathematical
pendulum ($\omega = 0$).

For $2B > A$, i.e., for sufficiently fast rotation of the circle, the lowest position of the bead
becomes unstable; on the other hand, two stable positions of the bead appear on the circle,
where $\cos q = -A/2B = -g/\omega^2 r$. The behavior of the bead under all possible initial conditions
is clear from the shape of the phase curves in the $(q, \dot{q})$-plane (Figure 69).
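The two regimes of the bead can be tabulated with a few lines of code. This is a minimal sketch under assumptions of our own: the function name and the numerical values of m, g, r and the angular velocity are hypothetical, and stability is decided by the sign of the second derivative of V(q) = A cos q - B sin^2 q, which equals -A cos q - 2B cos 2q.

```python
import math

def equilibria(m, g, r, omega):
    """Equilibria of V(q) = A cos q - B sin^2 q on [0, 2*pi), each with a stability flag."""
    A = m * g * r
    B = 0.5 * m * omega ** 2 * r ** 2

    def d2V(q):
        # Second derivative of V; a positive value means a strict local minimum.
        return -A * math.cos(q) - 2.0 * B * math.cos(2.0 * q)

    points = [0.0, math.pi]                  # highest and lowest positions
    if 2.0 * B > A:                          # fast rotation: two extra equilibria
        q_star = math.acos(-A / (2.0 * B))   # cos q = -A/(2B) = -g/(omega^2 r)
        points += [q_star, 2.0 * math.pi - q_star]
    return [(q, d2V(q) > 0.0) for q in points]

slow = equilibria(1.0, 9.81, 1.0, 2.0)   # omega^2 r = 4 < g
fast = equilibria(1.0, 9.81, 1.0, 5.0)   # omega^2 r = 25 > g
```

With these sample values the slow case returns the top position as unstable and the bottom position as stable, while the fast case returns both of them as unstable and the two new positions as stable.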


20  E.  Noether’s  theorem 

Various  laws  of  conservation  (of  momentum,  angular  momentum,  etc.)  are  particular  cases  of 
one  general  theorem:  to  every  one-parameter  group  of  diffeomorphisms  of  the  configuration 
manifold  of  a  lagrangian  system  which  preserves  the  lagrangian  function,  there  corresponds  a 
first  integral  of  the  equations  of  motion. 

A  Formulation  of  the  theorem 

Let M be a smooth manifold, $L: TM \to \mathbb{R}$ a smooth function on its tangent
bundle TM. Let $h: M \to M$ be a smooth map.


Definition. A lagrangian system (M, L) admits the mapping h if for any tangent
vector $v \in TM$,

$$L(h_* v) = L(v).$$

Example. Let $M = \{(x_1, x_2, x_3)\}$, $L = (m/2)(\dot{x}_1^2 + \dot{x}_2^2 + \dot{x}_3^2) - U(x_2, x_3)$. The system admits
the translation $h: (x_1, x_2, x_3) \to (x_1 + s, x_2, x_3)$ along the $x_1$ axis and does not admit, generally
speaking, translations along the $x_2$ axis.


Noether's theorem. If the system (M, L) admits the one-parameter group of
diffeomorphisms $h^s: M \to M$, $s \in \mathbb{R}$, then the lagrangian system of equations
corresponding to L has a first integral $I: TM \to \mathbb{R}$.

In local coordinates q on M the integral I is written in the form

$$I(q, \dot{q}) = \frac{\partial L}{\partial \dot{q}}\,\frac{dh^s(q)}{ds}\bigg|_{s=0}.$$


B  Proof 

First, let $M = \mathbb{R}^n$ be coordinate space. Let $\varphi: \mathbb{R} \to M$, $q = \varphi(t)$ be a solution
to Lagrange's equations. Since $h^s_*$ preserves L, the translation of a solution,
$h^s \circ \varphi: \mathbb{R} \to M$, also satisfies Lagrange's equations for any s.38

We consider the mapping $\Phi: \mathbb{R} \times \mathbb{R} \to \mathbb{R}^n$, given by $q = \Phi(s, t) = h^s(\varphi(t))$
(Figure 70).

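We will denote derivatives with respect to t by dots and with respect to s
by primes. By hypothesis

$$0 = \frac{\partial L(\Phi, \dot{\Phi})}{\partial s} = \frac{\partial L}{\partial q}\cdot\Phi' + \frac{\partial L}{\partial \dot{q}}\cdot\dot{\Phi}', \tag{1}$$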


38 The authors of several textbooks mistakenly assert that the converse is also true, i.e., that if
$h^s$ takes solutions to solutions, then $h^s_*$ preserves L.

Figure 70 Noether's theorem


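where the partial derivatives of L are taken at the point $q = \Phi(s, t)$,
$\dot{q} = \dot{\Phi}(s, t)$.

As we stated above, the mapping $\Phi|_{s=\text{const}}: \mathbb{R} \to \mathbb{R}^n$ for any fixed s
satisfies Lagrange's equation

$$\frac{\partial}{\partial t}\left[\frac{\partial L}{\partial \dot{q}}(\Phi(s, t), \dot{\Phi}(s, t))\right] = \frac{\partial L}{\partial q}(\Phi(s, t), \dot{\Phi}(s, t)).$$

We introduce the notation $F(s, t) = (\partial L/\partial \dot{q})(\Phi(s, t), \dot{\Phi}(s, t))$ and substitute
$\partial F/\partial t$ for $\partial L/\partial q$ in (1).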

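Writing $\dot{q}'$ as $dq'/dt$, we get

$$0 = \left(\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}\right) q' + \frac{\partial L}{\partial \dot{q}}\left(\frac{d}{dt} q'\right) = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\, q'\right) = \frac{dI}{dt}. \qquad \square$$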


Remark. The first integral $I = (\partial L/\partial \dot{q})\, q'$ is defined above using local
coordinates q. It turns out that the value of I(v) does not depend on the choice
of coordinate system q.

In fact, I is the rate of change of L(v) when the vector $v \in TM_x$ varies inside
$TM_x$ with velocity $(d/ds)|_{s=0} h^s x$. Therefore, I(v) is well defined as a function
of the tangent vector $v \in TM_x$. Noether's theorem is proved in the same way
when M is a manifold.


C  Examples 

Example 1. Consider a system of point masses with masses $m_i$:

$$L = \sum_i \frac{m_i \dot{x}_i^2}{2} - U(x), \qquad x_i = x_{i1} e_1 + x_{i2} e_2 + x_{i3} e_3,$$

constrained by the conditions $f_j(x) = 0$. We assume that the system admits
translations along the $e_1$ axis:

$$h^s: x_i \to x_i + s e_1 \quad \text{for all } i.$$

In other words, the constraints admit motions of the system as a whole
along the $e_1$ axis, and the potential energy does not change under these.

By Noether's theorem we conclude: If a system admits translations along
the $e_1$ axis, then the projection of its center of mass on the $e_1$ axis moves
linearly and uniformly.

In fact, $(d/ds)|_{s=0} h^s x_i = e_1$. According to the remark at the end of B, the
quantity

$$I = \sum_i \frac{\partial L}{\partial \dot{x}_i}\, e_1 = \sum_i m_i \dot{x}_{i1}$$

is preserved, i.e., the first component $P_1$ of the momentum vector is preserved.
We showed this earlier for a system without constraints.


Example 2. If a system admits rotations around the $e_1$ axis, then the angular
momentum with respect to this axis,

$$M_1 = \sum_i ([x_i, m_i \dot{x}_i], e_1),$$

is conserved.

It is easy to verify that if $h^s$ is rotation around the $e_1$ axis by the angle s,
then $(d/ds)|_{s=0} h^s x_i = [e_1, x_i]$, from which it follows that

$$I = \sum_i \frac{\partial L}{\partial \dot{x}_i}\,[e_1, x_i] = \sum_i (m_i \dot{x}_i, [e_1, x_i]) = \sum_i ([x_i, m_i \dot{x}_i], e_1).$$
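Examples 1 and 2 can be illustrated together by a numerical experiment. The sketch below makes several assumptions of its own: a single free particle of hypothetical mass M moves in the hypothetical potential U = K rho^2/2 + C rho^4/4 with rho^2 = x_2^2 + x_3^2, which admits both translations along the e_1 axis and rotations around it, and the motion is integrated with a classical Runge-Kutta step. The quantities P_1 and M_1 should then stay constant, while P_2 (for which no symmetry is available) should not.

```python
M, K, C = 1.5, 1.0, 0.5   # hypothetical mass and potential coefficients

def force(x):
    """-dU/dx for U = K*rho2/2 + C*rho2**2/4, rho2 = x2^2 + x3^2 (no x1 dependence)."""
    rho2 = x[1] ** 2 + x[2] ** 2
    radial = -(K + C * rho2)
    return (0.0, radial * x[1], radial * x[2])

def rhs(state):
    """Newton's equations as a first-order system; state = (x1, x2, x3, v1, v2, v3)."""
    f = force(state[:3])
    return state[3:] + tuple(fi / M for fi in f)

def rk4_step(state, dt):
    """One classical Runge-Kutta step."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def momentum(state, axis):
    """Component of the momentum vector along the given axis (0, 1 or 2)."""
    return M * state[3 + axis]

def angular_momentum_1(state):
    """M_1 = ([x, m xdot], e_1) = m (x2 v3 - x3 v2)."""
    return M * (state[1] * state[5] - state[2] * state[4])

state = (0.0, 1.0, 0.0, 0.3, 0.2, 0.7)   # hypothetical initial data
p1_0, p2_0, m1_0 = momentum(state, 0), momentum(state, 1), angular_momentum_1(state)
p1_drift = p2_drift = m1_drift = 0.0
for _ in range(5000):
    state = rk4_step(state, 1e-3)
    p1_drift = max(p1_drift, abs(momentum(state, 0) - p1_0))
    p2_drift = max(p2_drift, abs(momentum(state, 1) - p2_0))
    m1_drift = max(m1_drift, abs(angular_momentum_1(state) - m1_0))
```

The drift of P_1 and M_1 remains at the level of the integration error, whereas P_2 changes by a finite amount.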


Problem 1. Suppose that a particle moves in the field of the uniform helical line $x = \cos\varphi$,
$y = \sin\varphi$, $z = c\varphi$. Find the law of conservation corresponding to this helical symmetry.

Answer. In any system which admits helical motions leaving our helical line fixed, the quantity
$I = cP_3 + M_3$ is conserved.

Problem  2.  Suppose  that  a  rigid  body  is  moving  under  its  own  inertia.  Show  that  its  center  of 
mass  moves  linearly  and  uniformly.  If  the  center  of  mass  is  at  rest,  then  the  angular  momentum 
with  respect  to  it  is  conserved. 

Problem 3. What quantity is conserved under the motion of a heavy rigid body if it is fixed at
some point O? What if, in addition, the body is symmetric with respect to an axis passing
through O?


Problem  4.  Extend  Noether’s  theorem  to  non-autonomous  lagrangian  systems. 

Hint. Let $M_1 = M \times \mathbb{R}$ be the extended configuration space (the direct product of the
configuration manifold M with the time axis $\mathbb{R}$).

Define a function $L_1: TM_1 \to \mathbb{R}$ by

$$L_1 = L\,\frac{dt}{d\tau};$$

i.e., in local coordinates q, t on $M_1$ we define it by the formula

$$L_1\left(q, t, \frac{dq}{d\tau}, \frac{dt}{d\tau}\right) = L\left(q, \frac{dq/d\tau}{dt/d\tau}, t\right)\frac{dt}{d\tau}.$$

We apply Noether's theorem to the lagrangian system $(M_1, L_1)$.

If $L_1$ admits the transformations $h^s: M_1 \to M_1$, we obtain a first integral $I_1: TM_1 \to \mathbb{R}$.
Since $\int L\, dt = \int L_1\, d\tau$, this reduces to a first integral $I: TM \times \mathbb{R} \to \mathbb{R}$ of the original system.
If, in local coordinates (q, t) on $M_1$, we have $I_1 = I_1(q, t, dq/d\tau, dt/d\tau)$, then $I(q, \dot{q}, t) = I_1(q, t, \dot{q}, 1)$.

In particular, if L does not depend on time, $L_1$ admits translations along time, $h^s(q, t) =
(q, t + s)$. The corresponding first integral I is the energy integral.
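The last claim of the hint can be made explicit by a short computation. The following is a sketch of that step only; the primes denoting derivatives with respect to $\tau$ and the symbol E for the energy are notation introduced here for convenience.

```latex
% Time translation h^s(q, t) = (q, t + s) has velocity (0, 1) in the coordinates (q, t),
% so Noether's integral for L_1(q, t, q', t') = L(q, q'/t', t) t' is
\[
I_1 = \frac{\partial L_1}{\partial t'}
    = L + t' \,\frac{\partial L}{\partial \dot q}\cdot\left(-\frac{q'}{t'^2}\right)
    = L - \frac{\partial L}{\partial \dot q}\,\frac{q'}{t'} .
\]
% Setting q' = \dot q and t' = 1 gives the integral of the original system:
\[
I(q, \dot q) = L - \frac{\partial L}{\partial \dot q}\,\dot q = -E,
\qquad
E = \frac{\partial L}{\partial \dot q}\,\dot q - L .
\]
% For a natural system L = T - U, T is quadratic in \dot q, so
% (\partial L/\partial \dot q)\,\dot q = 2T and E = T + U.
```

Thus, up to sign, the first integral produced by time translations is the total energy.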


21  D’Alembert’s  principle 

We  give  here  a  new  definition  of  a  system  of  point  masses  with  holonomic  constraints  and  prove 
its  equivalence  to  the  definition  given  in  Section  17. 


A  Example 

Consider the holonomic system (M, L), where M is a surface in three-dimensional
space {x}:

$$L = \tfrac{1}{2} m \dot{x}^2 - U(x).$$

In mechanical terms, "the mass point x of mass m must remain on the smooth
surface M."

Consider a motion of the point, x(t). If Newton's equations $m\ddot{x} + (\partial U/\partial x) = 0$
were satisfied, then in the absence of external forces (U = 0) the trajectory
would be a straight line and could not lie on the surface M.

From the point of view of Newton, this indicates the presence of a new
force "forcing the point to stay on the surface."


Definition. The quantity

$$R = m\ddot{x} + \frac{\partial U}{\partial x}$$

is called the constraint force (Figure 71).


Figure  71  Constraint  force 

If we take the constraint force R(t) into account, Newton's equations are
obviously satisfied:

$$m\ddot{x} = -\frac{\partial U}{\partial x} + R.$$

The physical meaning of the constraint force becomes clear if we consider our system with
constraints as the limit of systems with potential energy $U + NU_1$ as $N \to \infty$, where $U_1(x) =
\rho^2(x, M)$. For large N the constraint potential $NU_1$ produces a rapidly changing force
$F = -N\,\partial U_1/\partial x$; when we pass to the limit ($N \to \infty$) the average value of the force F under
oscillations of x near M is R. The force F is perpendicular to M. Therefore, the constraint
force R is perpendicular to M: $(R, \xi) = 0$ for every tangent vector $\xi$.

B  Formulation of the D'Alembert-Lagrange principle

In mechanics, tangent vectors to the configuration manifold are called
virtual variations. The D'Alembert-Lagrange principle states:

$$\left(m\ddot{x} + \frac{\partial U}{\partial x}, \xi\right) = 0$$

for any virtual variation $\xi$, or stated differently, the work of the constraint force
on any virtual variation is zero.

For a system of points $x_i$ with masses $m_i$ the constraint forces $R_i$ are defined
by $R_i = m_i\ddot{x}_i + (\partial U/\partial x_i)$, and D'Alembert's principle has the form $\sum (R_i, \xi_i) = 0$,
or $\sum ((m_i\ddot{x}_i + (\partial U/\partial x_i)), \xi_i) = 0$, i.e., the sum of the works of the constraint
forces on any virtual variation $\{\xi_i\} \in TM_x$ is zero.

Constraints  with  the  property  described  above  are  called  ideal. 

If we define a system with holonomic constraints as a limit as $N \to \infty$, then the
D'Alembert-Lagrange principle becomes a theorem: its proof is sketched above for the simplest case.

It is possible, however, to define an ideal holonomic constraint using the
D'Alembert-Lagrange principle. In this way we have three definitions of holonomic systems with constraints:

1. The limit of systems with potential energies $U + NU_1$ as $N \to \infty$.

2. A holonomic system (M, L), where M is a smooth submanifold of the configuration space
of a system without constraints and L is the lagrangian.

3. A system which complies with the D'Alembert-Lagrange principle.

All three definitions are mathematically equivalent.

The proof of the implications (1) ⇒ (2) and (1) ⇒ (3) is sketched above and will not be given
in further detail. We will now show that (2) ⇔ (3).

C  The equivalence of the D'Alembert-Lagrange principle and the variational principle

Let M be a submanifold of euclidean space, $M \subset \mathbb{R}^N$, and $x: \mathbb{R} \to M$ a curve,
with $x(t_0) = x_0$, $x(t_1) = x_1$.

Definition. The curve x is called a conditional extremal of the action functional

$$\Phi = \int_{t_0}^{t_1} \left\{\frac{\dot{x}^2}{2} - U(x)\right\} dt$$

if the differential $\delta\Phi$ is equal to zero under the condition that the variation
consists of nearby curves39 joining $x_0$ to $x_1$ in M.

39 Strictly speaking, in order to define a variation $\delta\Phi$, one must define on the set of curves near x
on M the structure of a region in a vector space. This can be done using coordinates on M;
however, the property of being a conditional extremal does not depend on the choice of a
coordinate system.

We will write

$$\delta_M \Phi = 0. \tag{1}$$

Clearly, Equation (1) is equivalent to the Lagrange equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q}, \qquad L = \frac{\dot{x}^2}{2} - U(x), \qquad x = x(q),$$

in some local coordinate system q on M.


Theorem. A curve $x: \mathbb{R} \to M \subset \mathbb{R}^N$ is a conditional extremal of the action
(i.e., satisfies Equation (1)) if and only if it satisfies D'Alembert's equation

$$\left(\ddot{x} + \frac{\partial U}{\partial x}, \xi\right) = 0, \qquad \forall \xi \in TM_x. \tag{2}$$

Lemma. Let $f: \{t: t_0 \le t \le t_1\} \to \mathbb{R}^N$ be a continuous vector field. If for every
continuous tangent vector field $\xi$, tangent to M along x (i.e., $\xi(t) \in TM_{x(t)}$,
with $\xi(t) = 0$ for $t = t_0, t_1$), we have

$$\int_{t_0}^{t_1} f(t)\,\xi(t)\, dt = 0,$$

then the field f(t) is perpendicular to M at every point x(t) (i.e., $(f(t), h) = 0$
for every vector $h \in TM_{x(t)}$) (Figure 72).


Figure  72  Lemma  about  the  normal  field 


The  proof  of  the  lemma  repeats  the  argument  which  we  used  to  derive  the 
Euler-Lagrange  equations  in  Section  12. 

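Proof of the theorem. We compare the value of $\Phi$ on the two curves x(t)
and $x(t) + \xi(t)$, where $\xi(t_0) = \xi(t_1) = 0$. Integrating by parts, we obtain

$$\Phi(x + \xi) - \Phi(x) = \int_{t_0}^{t_1} \left(\dot{x}\dot{\xi} - \frac{\partial U}{\partial x}\xi\right) dt + O(\xi^2) = -\int_{t_0}^{t_1} \left(\ddot{x} + \frac{\partial U}{\partial x}\right)\xi\, dt + O(\xi^2).$$

It is obvious from this formula40 that Equation (1), $\delta_M\Phi = 0$, is equivalent
to the collection of equations

$$\int_{t_0}^{t_1} \left(\ddot{x} + \frac{\partial U}{\partial x}\right)\xi\, dt = 0 \tag{3}$$

for all tangent vector fields $\xi(t) \in TM_{x(t)}$ with $\xi(t_0) = \xi(t_1) = 0$. By the
lemma (where we must set $f = \ddot{x} + (\partial U/\partial x)$) the collection of equations (3)
is equivalent to the D'Alembert-Lagrange equation (2). □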


D  Remarks 

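Remark 1. We derive the D'Alembert-Lagrange principle for a system of n
points $x_i \in \mathbb{R}^3$, $i = 1, \ldots, n$, with masses $m_i$, with holonomic constraints,
from the above theorem.

In the coordinates $\mathbf{x} = \{\tilde{x}_i = \sqrt{m_i}\, x_i\}$, the kinetic energy takes the form
$T = \tfrac{1}{2} \sum m_i \dot{x}_i^2 = \tfrac{1}{2}\dot{\mathbf{x}}^2$.

By the theorem, the extremals of the principle of least action satisfy the
condition

$$\left(\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}, \boldsymbol{\xi}\right) = 0$$

(the D'Alembert-Lagrange principle for points in $\mathbb{R}^{3n}$: the 3n-dimensional
reaction force is orthogonal to the manifold M in the metric T). Returning
to the coordinates $x_i$, we get

$$0 = \sum_i \left(\sqrt{m_i}\,\ddot{x}_i + \frac{1}{\sqrt{m_i}}\frac{\partial U}{\partial x_i}, \sqrt{m_i}\,\xi_i\right) = \sum_i \left(m_i\ddot{x}_i + \frac{\partial U}{\partial x_i}, \xi_i\right),$$

i.e., the D'Alembert-Lagrange principle in the form indicated earlier: the
sum of the work of the reaction forces on virtual variations is zero.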

Remark 2. The D'Alembert-Lagrange principle can be given in a slightly
different form if we turn to statics. An equilibrium position is a point $x_0$ which
is the orbit of a motion: $x(t) = x_0$.

Suppose that a point mass moves along a smooth surface M under the
influence of the force $f = -\partial U/\partial x$.

Theorem. The point $x_0$ in M is an equilibrium position if and only if the force
is orthogonal to the surface at $x_0$: $(f(x_0), \xi) = 0$ for all $\xi \in TM_{x_0}$.

This follows from the D'Alembert-Lagrange equations in view of the
fact that $\ddot{x} = 0$.

Definition. $-m\ddot{x}$ is called the force of inertia.

40 The distance of the points $x(t) + \xi(t)$ from M is small of second order compared with $\xi(t)$.

Now the D'Alembert-Lagrange principle takes the form:

Theorem. If the forces of inertia are added to the acting forces, x becomes an
equilibrium position.

Proof. D'Alembert's equation

$$(-m\ddot{x} + f, \xi) = 0$$

expresses the fact, as in the preceding theorem, that x is an equilibrium
position of a system with forces $-m\ddot{x} + f$. □

Entirely analogous statements are true for systems of points: If $x = \{x_i\}$
are equilibrium positions, then the sum of the work of the forces acting on the
virtual variations is equal to zero. If the forces of inertia $-m_i\ddot{x}_i(t)$ are added
to the acting forces, then the position x(t) becomes an equilibrium position.

Now  a  problem  about  motions  can  be  reduced  to  a  problem  about 
equilibrium  under  actions  of  other  forces. 

Remark  3.  Up  to  now  we  have  not  considered  cases  when  the  constraints 
depend  on  time.  All  that  was  said  above  carries  over  to  such  constraints 
without  any  changes. 

Example. Consider a bead sliding along a rod which is tilted at an angle $\alpha$
to the vertical axis and is rotating uniformly with angular velocity $\omega$ around
this axis (its weight is negligible). For our coordinate q we take the distance
from the point O (Figure 73). The kinetic energy and lagrangian are:

$$L = T = \tfrac{1}{2} m v^2 = \tfrac{1}{2} m \dot{q}^2 + \tfrac{1}{2} m \omega^2 r^2, \qquad r = q \sin\alpha.$$

Lagrange's equation: $m\ddot{q} = m\omega^2 q \sin^2\alpha$.

The  constraint  force  at  each  moment  is  orthogonal  to  virtual  variations 
(i.e.,  to  the  direction  of  the  rod),  but  is  not  at  all  orthogonal  to  the  actual 
trajectory. 
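Lagrange's equation for the bead is linear with constant coefficients, so it can be solved in closed form. The short computation below is added as an illustration; the abbreviation $\kappa$ is our own notation, and we assume $\omega \sin\alpha \neq 0$.

```latex
% With \kappa = \omega \sin\alpha, Lagrange's equation reads
\[
\ddot q = \kappa^2 q ,
\]
% and its solution with initial data q(0), \dot q(0) is
\[
q(t) = q(0)\cosh \kappa t + \frac{\dot q(0)}{\kappa}\,\sinh \kappa t .
\]
% Check: differentiating twice gives \ddot q = \kappa^2 q,
% and at t = 0 the formula returns q(0) and \dot q(0).
```

Apart from the exceptional initial data with $\dot{q}(0) = -\kappa q(0)$, the bead moves away from the point O at an exponentially growing rate.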

Remark 4. It is easy to derive conservation laws from the D'Alembert-Lagrange
equations. For example, if translation along the $e_1$ axis, $\xi_i = e_1$, is
among the virtual variations, then the sum of the work of the constraint forces
on this variation is equal to zero:

$$\sum_i (R_i, e_1) = \left(\sum_i R_i, e_1\right) = 0.$$

If we now consider constraint forces as external forces, then we notice that the
sum of the first components of the external forces is equal to zero. This means
that the first component, $P_1$, of the momentum vector is preserved.

We  obtained  this  same  result  earlier  from  Noether’s  theorem. 

Remark  5.  We  emphasize  once  again  that  the  holonomic  character  of  some 
particular  physical  constraint  or  another  (to  a  given  degree  of  exactness)  is  a 
question  of  experiment.  From  the  mathematical  point  of  view,  the  holonomic 
character of a constraint is a postulate of physical origin; it can be introduced
in various equivalent forms, for example, in the form of the principle of least
action (1) or the D'Alembert-Lagrange principle (2), but, when defining
the  constraints,  the  term  always  refers  to  experimental  facts  which  go  beyond 
Newton’s  equations. 

Remark  6.  Our  terminology  differs  somewhat  from  that  used  in  mechanics 
textbooks, where the D'Alembert-Lagrange principle is extended to a wider
class  of  systems  (“non-holonomic  systems  with  ideal  constraints”).  In  this 
book  we  will  not  consider  non-holonomic  systems.  We  remark  only  that  one 
example  of  a  non-holonomic  system  is  a  sphere  rolling  on  a  plane  without 
slipping.  In  the  tangent  space  at  each  point  of  the  configuration  manifold  of  a 
non-holonomic  system  there  is  a  fixed  subspace  to  which  the  velocity  vector 
must  belong. 

Remark 7. If a system consists of mass points connected by rods, hinges,
etc., then the need may arise to talk about the constraint force of some particular
constraint.

We defined the total "constraint force of all constraints" $R_i$ for every mass
point $m_i$. The concept of a constraint force for an individual constraint is
impossible to define, as may be already seen from the simple example of a beam
resting on three columns. If we try to define constraint forces of the columns,
$R_1, R_2, R_3$, by passing to a limit (considering the columns as very rigid
springs), then we may become convinced that the result depends on the
distribution of rigidity.


Problems for students are selected so that this difficulty does not arise.

Figure 74 Constraint force on a rod

Problem.  A  rod  of  weight  P,  tilted  at  an  angle  of  60°  to  the  plane  of  a  table,  begins  to  fall 
with  initial  velocity  zero  (Figure  74).  Find  the  constraint  force  of  the  table  at  the  initial  moment, 
considering  the  table  as  (a)  absolutely  smooth  and  (b)  absolutely  rough.  (In  the  first  case,  the 
holonomic  constraint  holds  the  end  of  the  rod  on  the  plane  of  the  table,  and  in  the  second  case, 
at a given point.)

Because linear equations are easy to solve and study, the theory of linear
oscillations is the most highly developed area of mechanics. In many nonlinear
problems, linearization produces a satisfactory approximate solution.
Even  when  this  is  not  the  case,  the  study  of  the  linear  part  of  a  problem  is 
often  a  first  step,  to  be  followed  by  the  study  of  the  relation  between  motions 
in  a  nonlinear  system  and  in  its  linear  model. 


22  Linearization 

We  give  here  the  definition  of  small  oscillations. 

A  Equilibrium  positions 

Definition. A point $x_0$ is called an equilibrium position of the system

$$\frac{dx}{dt} = f(x), \qquad x \in \mathbb{R}^n \tag{1}$$

if $x(t) = x_0$ is a solution of this system. In other words, $f(x_0) = 0$, i.e.,
the vector field f(x) is zero at $x_0$.

Example. Consider the natural dynamical system with lagrangian function
$L(q, \dot{q}) = T - U$, where $T = \tfrac{1}{2}\sum a_{ij}(q)\dot{q}_i\dot{q}_j \ge 0$ and $U = U(q)$:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q}, \qquad q = (q_1, \ldots, q_n). \tag{2}$$

Lagrange's equations can be written in the form of a system of 2n first-order
equations of form (1). We will try to find an equilibrium position:




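Theorem. The point $q = q_0$, $\dot{q} = \dot{q}_0$ will be an equilibrium position if and only
if $\dot{q}_0 = 0$ and $q_0$ is a critical point of the potential energy, i.e.,

$$\frac{\partial U}{\partial q}\bigg|_{q_0} = 0. \tag{3}$$

Proof. We write down Lagrange's equations

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}} = \frac{\partial T}{\partial q} - \frac{\partial U}{\partial q}.$$

From (2) it is clear that, for $\dot{q} = 0$, we will have $\partial T/\partial q = 0$ and $\partial T/\partial \dot{q} = 0$.
Therefore, $q = q_0$ is a solution in case (3) holds and only in that case. □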


B  Stability  of  equilibrium  positions 

We will now investigate motions with initial conditions close to an
equilibrium position.


Theorem.  If  the  point  q0  is  a  strict  local  minimum  of  the  potential  energy  U, 
then  the  equilibrium  q  =  q0  is  stable  in  the  sense  of  Liapunov. 

Proof. Let $U(q_0) = h$. For sufficiently small $\varepsilon > 0$, the connected
component of the set $\{q : U(q) \le h + \varepsilon\}$ containing $q_0$ will be an arbitrarily
small neighborhood of $q_0$ (Figure 75). Furthermore, the connected
component of the corresponding region in phase space $p, q$, $\{p, q : E(p, q) \le h + \varepsilon\}$
(where $p = \partial T/\partial \dot{q}$ is the momentum and $E = T + U$ is the total
energy), will be an arbitrarily small neighborhood of the point $p = 0$, $q = q_0$.

But the region $\{p, q : E \le h + \varepsilon\}$ is invariant with respect to the phase
flow by the law of conservation of energy. Therefore, for initial conditions
$p(0)$, $q(0)$ close enough to $(0, q_0)$, every phase trajectory $(p(t), q(t))$ is close to
$(0, q_0)$. □


Figure 75 Stable equilibrium position

Problem. Can an equilibrium position $q = q_0$, $p = 0$ be asymptotically stable?

Problem. Show that in an analytic system with one degree of freedom an equilibrium position
$q_0$ which is not a strict local minimum of the potential energy is not stable in the sense of
Liapunov. Produce an example of an infinitely differentiable system where this is not true.


Remark.  It  seems  likely  that  in  an  analytic  system  with  n  degrees  of 
freedom,  an  equilibrium  position  which  is  not  a  minimum  point  is  unstable; 
but  this  has  never  been  proved  for  n  >  2. 


C  Linearization  of  a  differential  equation 

We now turn to the general system (1). In studying solutions of (1) which are
close to an equilibrium position $x_0$, we often use a linearization. Assume that
$x_0 = 0$ (the general case is reduced to this one by a translation of the
coordinate system). Then the first term of the Taylor series for $f$ is linear:

$$f(x) = Ax + R_2(x), \quad A = \left. \frac{\partial f}{\partial x} \right|_0 \quad \text{and} \quad R_2 = O(x^2),$$

where the linear operator $A$ is given in coordinates $x_1, \ldots, x_n$ by the matrix $a_{ij}$:

$$(Ax)_i = \sum_j a_{ij} x_j; \qquad a_{ij} = \left. \frac{\partial f_i}{\partial x_j} \right|_0.$$

Definition. The passage from system (1) to the system

(4) $\frac{dy}{dt} = Ay \quad (x \in \mathbb{R}^n,\ y \in T\mathbb{R}^n_0)$

is called the linearization of (1).


Problem.  Show  that  linearization  is  a  well-defined  operation :  the  operator 
A  does  not  depend  on  the  coordinate  system. 

The advantage of the linearized system is that it is linear and therefore
easily solved:

$$y(t) = e^{At} y(0), \quad \text{where } e^{At} = E + At + \frac{A^2 t^2}{2} + \cdots.$$

Knowing the solution of the linearized system (4), we can say something
about solutions of the original system (1). For small enough $x$, the difference
between the linearized and original systems, $R_2(x)$, is small in comparison
with $x$. Therefore, for a long time, the solutions $y(t)$, $x(t)$ of both systems
with initial conditions $y(0) = x(0) = x_0$ remain close. More explicitly, we
can easily prove the following:

Theorem. For any $T > 0$ and for any $\varepsilon > 0$ there is a $\delta > 0$ such that if
$|x(0)| < \delta$, then $|x(t) - y(t)| < \varepsilon\delta$ for all $t$ in the interval $0 < t < T$.
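The estimate in this theorem can be observed numerically. The following is a minimal sketch, not part of the original text: it assumes the pendulum system $\dot{x}_1 = x_2$, $\dot{x}_2 = -\sin x_1$ as the nonlinear system (1), uses a hand-written Runge-Kutta integrator, and all function names are illustrative. It reports the largest value of $|x(t) - y(t)|/\delta$ on $0 \le t \le T$ for several $\delta$; under these assumptions the ratio should shrink as $\delta$ does.

```python
import math

def rk4_step(f, x, h):
    """One classical Runge-Kutta step for the autonomous system x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def pendulum(x):
    # Assumed example of system (1): x1' = x2, x2' = -sin x1.
    return [x[1], -math.sin(x[0])]

def linearized(y):
    # Its linearization (4) at 0: y' = A y with A = [[0, 1], [-1, 0]].
    return [y[1], -y[0]]

def max_relative_error(delta, t_end=10.0, steps=2000):
    """Largest |x(t) - y(t)| / delta on 0 <= t <= t_end for x(0) = y(0) = (delta, 0)."""
    h = t_end / steps
    x, y = [delta, 0.0], [delta, 0.0]
    worst = 0.0
    for _ in range(steps):
        x = rk4_step(pendulum, x, h)
        y = rk4_step(linearized, y, h)
        worst = max(worst, math.hypot(x[0] - y[0], x[1] - y[1]))
    return worst / delta

for delta in (0.4, 0.2, 0.1, 0.05):
    print(delta, max_relative_error(delta))
```

The initial condition $(\delta, 0)$ and the interval length 10 are arbitrary choices; any other small initial vector would serve equally well.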
D Linearization of a lagrangian system

We  return  again  to  the  lagrangian  system  (2)  and  try  to  linearize  it  in  a 
neighborhood  of  the  equilibrium  position  q  =  q0.  In  order  to  simplify  the 
formulas,  we  choose  a  coordinate  system  so  that  q0  =  0. 


Theorem. In order to linearize the lagrangian system (2) in a neighborhood of
the equilibrium position $q = 0$, it is sufficient to replace the kinetic energy
$T = \frac{1}{2} \sum a_{ij}(q)\, \dot{q}_i \dot{q}_j$ by its value at $q = 0$,

$$T_2 = \frac{1}{2} \sum a_{ij}\, \dot{q}_i \dot{q}_j, \quad a_{ij} = a_{ij}(0),$$

and replace the potential energy $U(q)$ by its quadratic part

$$U_2 = \frac{1}{2} \sum b_{ij}\, q_i q_j, \quad b_{ij} = \left. \frac{\partial^2 U}{\partial q_i \partial q_j} \right|_{q=0}.$$

Proof. We reduce the lagrangian system to the form (1) by using the canonical
variables $p$ and $q$:

$$\dot{p} = -\frac{\partial H}{\partial q}, \quad \dot{q} = \frac{\partial H}{\partial p}, \quad H(p, q) = T + U.$$

Since $p = q = 0$ is an equilibrium position, the expansions of the right-hand
sides in Taylor series at zero begin with terms that are linear in $p$ and $q$.
Since the right-hand sides are partial derivatives, these linear terms are
determined by the quadratic terms $H_2$ of the expansion for $H(p, q)$. But
$H_2$ is precisely the hamiltonian function of the system with lagrangian
$L_2 = T_2 - U_2$, since, clearly, $H_2 = T_2(p) + U_2(q)$. Therefore, the linearized
equations of motion are the equations of motion for the system described
in the theorem with $L_2 = T_2 - U_2$. □


Example. We consider the system with one degree of freedom:

$$T = \tfrac{1}{2} a(q)\, \dot{q}^2, \quad U = U(q).$$

Let $q = q_0$ be a stable equilibrium position: $(\partial U/\partial q)|_{q=q_0} = 0$, $(\partial^2 U/\partial q^2)|_{q=q_0} > 0$
(Figure 76).

As we know from the phase portrait, for initial conditions close to $q = q_0$,
$p = 0$, the solution is periodic with period $\tau$ depending, generally speaking,
on the initial conditions. The above two theorems imply

Corollary. The period $\tau$ of oscillations close to the equilibrium position $q_0$
approaches the limit $\tau_0 = 2\pi/\omega_0$ (where $\omega_0^2 = b/a$, $b = (\partial^2 U/\partial q^2)|_{q=q_0}$,
and $a = a(q_0)$) as the amplitudes of the oscillations decrease.

Proof. For the linearized system, $T_2 = \frac{1}{2} a \dot{q}^2$ and $U_2 = \frac{1}{2} b q^2$ (taking $q_0 = 0$).
The solutions to Lagrange’s equation $\ddot{q} = -\omega_0^2 q$ have period $\tau_0 = 2\pi/\omega_0$:

$$q = c_1 \cos \omega_0 t + c_2 \sin \omega_0 t$$

for any initial amplitude. □
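The corollary can be illustrated with a short computation. The sketch below is an illustration under stated assumptions rather than anything from the text: `small_oscillation_period` estimates $b = U''(q_0)$ by a central difference (the step `h` is an arbitrary choice), and `pendulum_period` uses the classical arithmetic-geometric-mean expression for the exact period of the pendulum $\ddot{q} = -\sin q$, which is brought in here only for comparison. Both function names are hypothetical.

```python
import math

def small_oscillation_period(a, U, q0, h=1e-4):
    """tau_0 = 2*pi/omega_0 with omega_0^2 = b/a, b = U''(q0), a = a(q0)."""
    # Central finite difference for the second derivative of U at q0.
    b = (U(q0 + h) - 2.0 * U(q0) + U(q0 - h)) / h**2
    if b <= 0:
        raise ValueError("q0 is not a nondegenerate minimum of U")
    return 2.0 * math.pi * math.sqrt(a(q0) / b)

def pendulum_period(amplitude):
    """Exact period of q'' = -sin q for a given angular amplitude (AGM formula)."""
    x, y = 1.0, math.cos(amplitude / 2.0)
    for _ in range(40):  # the AGM iteration converges in a few steps
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return 2.0 * math.pi / x

# Pendulum: a(q) = 1, U(q) = -cos q, equilibrium q0 = 0.
tau0 = small_oscillation_period(lambda q: 1.0, lambda q: -math.cos(q), 0.0)
for amplitude in (1.0, 0.5, 0.1, 0.01):
    print(amplitude, pendulum_period(amplitude), tau0)
```

As the amplitude decreases, the second printed column should approach the third, which is the limit $\tau_0$ of the corollary.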

E  Small  oscillations 

Definition. Motions in a linearized system ($L_2 = T_2 - U_2$) are called small
oscillations⁴¹ near an equilibrium $q = q_0$. In a one-dimensional problem
the numbers $\tau_0$ and $\omega_0$ are called the period and the frequency of small
oscillations.

Problem.  Find  the  period  of  small  oscillations  of  a  bead  of  mass  1  on  a  wire  y  =  U(x)  in  a 
gravitational  field  with  g  =  1,  near  an  equilibrium  position  x  =  x0  (Figure  77). 


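Solution. We have

$$U = mgy = U(x), \quad T = \tfrac{1}{2} m v^2 = \tfrac{1}{2} \left[ 1 + \left( \frac{\partial U}{\partial x} \right)^2 \right] \dot{x}^2.$$

Let $x_0$ be a stable equilibrium position: $(\partial U/\partial x)|_{x_0} = 0$; $(\partial^2 U/\partial x^2)|_{x_0} > 0$. Then the frequency
of small oscillations, $\omega$, is defined by the formula

$$\omega^2 = \left. \frac{\partial^2 U}{\partial x^2} \right|_{x_0},$$

since, for the linearized system, $T_2 = \frac{1}{2} \dot{q}^2$ and $U_2 = \frac{1}{2} \omega^2 q^2$ ($q = x - x_0$).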


41 If the equilibrium position is unstable, we will talk about “unstable small oscillations”
even though these motions may not have an oscillatory character.

Problem. Show that not only a small oscillation, but any motion of the bead is equivalent to a
motion in some one-dimensional system with lagrangian function $L = \frac{1}{2} \dot{q}^2 - V(q)$.

Hint. Take length along the wire for $q$.


23  Small  oscillations 

We  show  here  that  a  lagrangian  system  undergoing  small  oscillations  decomposes  into  a  direct 
product  of  systems  with  one  degree  of  freedom. 

A  A  problem  about  pairs  of  forms 

We  will  consider  in  more  detail  the  problem  of  small  oscillations.  In  other 
words,  we  consider  a  system  whose  kinetic  and  potential  energies  are 
quadratic  forms 

(1) $T = \frac{1}{2}(A\dot{q}, \dot{q}), \quad U = \frac{1}{2}(Bq, q), \quad q \in \mathbb{R}^n,\ \dot{q} \in \mathbb{R}^n.$

The  kinetic  energy  is  a  positive  definite  form. 

In  order  to  integrate  Lagrange’s  equations,  we  will  make  a  special  choice 
of  coordinates. 

As we know from linear algebra, a pair of quadratic forms $(Aq, q)$, $(Bq, q)$,
the first of which is positive definite, can be reduced to principal axes by a
linear change of coordinates:⁴²

$$Q = Cq, \quad Q = (Q_1, \ldots, Q_n).$$

In addition, the coordinates $Q$ can be chosen so that the form $(Aq, q)$
decomposes into the sum of squares $(Q, Q)$. Let $Q$ be such coordinates; then,
since $\dot{Q} = C\dot{q}$, we have

(2) $T = \frac{1}{2} \sum_{i=1}^{n} \dot{Q}_i^2, \quad U = \frac{1}{2} \sum_{i=1}^{n} \lambda_i Q_i^2.$

The numbers $\lambda_i$ are called the eigenvalues of the form $B$ with respect to $A$.

Problem. Show that the eigenvalues of $B$ with respect to $A$ satisfy the
characteristic equation

(3) $\det |B - \lambda A| = 0,$

all the roots of which are, therefore, real (the matrices $A$ and $B$ are symmetric
and $A > 0$).
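For two degrees of freedom the characteristic equation (3) is a quadratic in $\lambda$ and can be solved directly. The following sketch is illustrative only (the function name and the nested-list matrix representation are choices made here, not taken from the text). It expands $\det(B - \lambda A)$ for symmetric $2 \times 2$ matrices and returns the two roots in increasing order.

```python
import math

def relative_eigenvalues_2x2(A, B):
    """Roots of det(B - lam*A) = 0 for symmetric 2x2 matrices, A positive definite."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    c2 = a11 * a22 - a12 * a12                    # det A, positive by assumption
    c1 = a11 * b22 + a22 * b11 - 2.0 * a12 * b12  # coefficient of -lam
    c0 = b11 * b22 - b12 * b12                    # det B
    # The discriminant is nonnegative in exact arithmetic (the roots are real);
    # max() only guards against rounding.
    root = math.sqrt(max(c1 * c1 - 4.0 * c2 * c0, 0.0))
    return (c1 - root) / (2.0 * c2), (c1 + root) / (2.0 * c2)

# Example with A = E: kinetic energy already a sum of squares.
print(relative_eigenvalues_2x2([[1.0, 0.0], [0.0, 1.0]],
                               [[1.5, -0.5], [-0.5, 1.5]]))
```

For larger $n$ one would normally hand the pair $(A, B)$ to a numerical linear algebra routine for the symmetric generalized eigenvalue problem instead of expanding the determinant.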

B  Characteristic  oscillations 

In the coordinates $Q$ the lagrangian system decomposes into $n$ independent
equations

(4) $\ddot{Q}_i = -\lambda_i Q_i.$

42 If one wants to, one can introduce a euclidean structure by taking the first form as the scalar
product, and then reducing the second form to the principal axes by a transformation which is
orthogonal with respect to this euclidean structure.

Therefore we have proved:

Theorem. A system performing small oscillations is the direct product of $n$
one-dimensional systems performing small oscillations.

For the one-dimensional systems, there are three possible cases:

Case 1: $\lambda = \omega^2 > 0$; the solution is $Q = C_1 \cos \omega t + C_2 \sin \omega t$ (oscillation)

Case 2: $\lambda = 0$; the solution is $Q = C_1 + C_2 t$ (neutral equilibrium)

Case 3: $\lambda = -k^2 < 0$; the solution is $Q = C_1 \cosh kt + C_2 \sinh kt$
(instability)

Corollary. Suppose one of the eigenvalues of (3) is positive: $\lambda = \omega^2 > 0$. Then
system (1) can perform a small oscillation of the form

(5) $q(t) = (C_1 \cos \omega t + C_2 \sin \omega t)\, \xi,$

where $\xi$ is an eigenvector corresponding to $\lambda$ (Figure 78):

$$B\xi = \lambda A \xi.$$

Figure 78 Characteristic oscillation

This oscillation is the product of the one-dimensional motion
$Q_i = C_1 \cos \omega_i t + C_2 \sin \omega_i t$ and the trivial motion $Q_j = 0$ ($j \ne i$).

Definition. The periodic motion (5) is called a characteristic oscillation of
system (1), and the number $\omega$ is called the characteristic frequency.

Remark. Characteristic oscillations are also called principal oscillations
or normal modes. A nonpositive $\lambda$ also has eigenvectors; we will also call the
corresponding motions “characteristic oscillations,” although they are not
periodic; the corresponding “characteristic frequencies” are imaginary.


Problem. Show that the number of independent real characteristic
oscillations is equal to the dimension of the largest positive definite subspace for
the potential energy $\frac{1}{2}(Bq, q)$.

Now the result may be formulated as follows:

Theorem. The system (1) has $n$ characteristic oscillations, the directions of
which are pairwise orthogonal with respect to the scalar product given by
the kinetic energy $A$.

Proof. The coordinate system $Q$ is orthogonal with respect to the scalar
product $(Aq, q)$ by (2). □

C  Decomposition  into  characteristic  oscillations 
It  follows  from  the  above  theorem  that : 


Corollary.  Every  small  oscillation  is  a  sum  of  characteristic  oscillations. 


A  sum  of  characteristic  oscillations  is  generally  not  periodic  (remember 
the  Lissajous  figures!). 

To decompose a motion into a sum of characteristic oscillations, it is
sufficient to project the initial conditions $q$, $\dot{q}$ onto the characteristic
directions and solve the corresponding one-dimensional problems (4).

Therefore, the Lagrange equations for system (1) can be solved in the
following way. We first look for characteristic oscillations of the form
$q = e^{i\omega t} \xi$. Substituting these into Lagrange’s equations

$$\frac{d}{dt} A\dot{q} = -Bq,$$

we find

$$(B - \omega^2 A)\, \xi = 0.$$

From the characteristic equation (3) we find $n$ eigenvalues $\lambda_k = \omega_k^2$. To these
there correspond $n$ pairwise orthogonal eigenvectors $\xi_k$. A general solution
in the case $\lambda \ne 0$ has the form

$$q(t) = \operatorname{Re} \sum_{k=1}^{n} C_k e^{i\omega_k t} \xi_k.$$

Remark. This result is also true when some of the $\lambda$ are multiple
eigenvalues.

Thus, in a lagrangian system, as opposed to a general system of linear
differential equations, resonance terms of the form $t \sin \omega t$, etc. do not arise,
even in the case of multiple eigenvalues.

D  Examples 

Example 1. Consider the system of two identical mathematical pendulums of length $l_1 = l_2 = 1$
and mass $m_1 = m_2 = 1$ in a gravitational field with $g = 1$. Suppose that the pendulums are
connected by a weightless spring whose length is equal to the distance between the points of
suspension (Figure 79). Denote by $q_1$ and $q_2$ the angles of inclination of the pendulums. Then
for small oscillations, $T = \frac{1}{2}(\dot{q}_1^2 + \dot{q}_2^2)$ and $U = \frac{1}{2}(q_1^2 + q_2^2 + \alpha(q_1 - q_2)^2)$, where $\frac{1}{2}\alpha(q_1 - q_2)^2$
is the potential energy of the elasticity of the spring. Set

$$Q_1 = \frac{q_1 + q_2}{\sqrt{2}} \quad \text{and} \quad Q_2 = \frac{q_1 - q_2}{\sqrt{2}}.$$

Figure 79 Identical connected pendulums

Then

$$q_1 = \frac{Q_1 + Q_2}{\sqrt{2}} \quad \text{and} \quad q_2 = \frac{Q_1 - Q_2}{\sqrt{2}},$$

and both forms are reduced to principal axes:

$$T = \tfrac{1}{2}(\dot{Q}_1^2 + \dot{Q}_2^2), \quad U = \tfrac{1}{2}(\omega_1^2 Q_1^2 + \omega_2^2 Q_2^2),$$

where $\omega_1 = 1$ and $\omega_2 = \sqrt{1 + 2\alpha}$ (Figure 80). So the two characteristic oscillations are as
follows (Figure 81):

1. $Q_2 = 0$, i.e., $q_1 = q_2$; both pendulums move in phase with the original frequency 1, and the
spring has no effect;

2. $Q_1 = 0$, i.e., $q_1 = -q_2$; the pendulums move in opposite phase with increased frequency
$\omega_2 > 1$ due to the action of the spring.

Figure 80 Configuration space of the connected pendulums


Figure 81 Characteristic oscillations of the connected pendulums

Now let the spring be very weak: $\alpha \ll 1$. Then an interesting effect called exchange of energy
occurs.

Example 2. Suppose that the pendulums are at rest at the initial moment, and one of them is
given velocity $\dot{q}_1 = v$. We will show that after some time $T$ the first pendulum will be almost
stationary, and all the energy will have gone to the second.

It follows from the initial conditions that $Q_1(0) = Q_2(0) = 0$. Therefore, $Q_1 = c_1 \sin t$, and
$Q_2 = c_2 \sin \omega t$ with $\omega = \sqrt{1 + 2\alpha} \approx 1 + \alpha$ ($\alpha \ll 1$). But $\dot{Q}_1(0) = \dot{Q}_2(0) = v/\sqrt{2}$. Therefore,
$c_1 = v/\sqrt{2}$ and $c_2 = v/(\omega\sqrt{2})$, and our solution has the form

$$q_1 = \frac{v}{2} \left( \sin t + \frac{1}{\omega} \sin \omega t \right), \quad q_2 = \frac{v}{2} \left( \sin t - \frac{1}{\omega} \sin \omega t \right),$$

or, disregarding the term $v(1 - (1/\omega)) \sin \omega t$, which is small since $\alpha$ is,

$$q_1 \approx \frac{v}{2} (\sin t + \sin \omega t) = v \cos \varepsilon t \, \sin \omega' t,$$

$$q_2 \approx \frac{v}{2} (\sin t - \sin \omega t) = -v \cos \omega' t \, \sin \varepsilon t,$$

$$\varepsilon = \frac{\omega - 1}{2} \approx \frac{\alpha}{2}, \quad \omega' = \frac{\omega + 1}{2} \approx 1.$$

The quantity $\varepsilon \approx \alpha/2$ is small, since $\alpha$ is; therefore $q_1$ undergoes an oscillation of frequency
$\omega' \approx 1$ with slowly changing amplitude $v \cos \varepsilon t$ (Figure 82).

After time $T = \pi/2\varepsilon \approx \pi/\alpha$, essentially only the second pendulum will be oscillating; after
$2T$, again only the first, etc. (“beats”) (Figure 83).
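The exchange of energy can be checked directly from the exact solution of the linearized problem. The sketch below is an illustration, not part of the original text: the function names, the values $\alpha = 0.02$ and $v = 1$, and the choice to measure each pendulum's own energy $\frac{1}{2}(\dot{q}_i^2 + q_i^2)$ while leaving out the small spring energy are all assumptions made here.

```python
import math

def coupled_pendulums(t, v, alpha):
    """Exact small-oscillation solution with q1(0) = q2(0) = 0, q1'(0) = v, q2'(0) = 0."""
    omega = math.sqrt(1.0 + 2.0 * alpha)
    q1 = 0.5 * v * (math.sin(t) + math.sin(omega * t) / omega)
    q2 = 0.5 * v * (math.sin(t) - math.sin(omega * t) / omega)
    p1 = 0.5 * v * (math.cos(t) + math.cos(omega * t))   # velocity of pendulum 1
    p2 = 0.5 * v * (math.cos(t) - math.cos(omega * t))   # velocity of pendulum 2
    return q1, q2, p1, p2

def pendulum_energies(t, v, alpha):
    """Energy of each pendulum taken alone (the spring energy is left out)."""
    q1, q2, p1, p2 = coupled_pendulums(t, v, alpha)
    return 0.5 * (p1 * p1 + q1 * q1), 0.5 * (p2 * p2 + q2 * q2)

alpha, v = 0.02, 1.0
eps = (math.sqrt(1.0 + 2.0 * alpha) - 1.0) / 2.0
T = math.pi / (2.0 * eps)          # time of complete energy transfer
for t in (0.0, T, 2.0 * T):
    print(t, pendulum_energies(t, v, alpha))
```

At $t = 0$ all the energy is in the first pendulum; at $t = T$ almost all of it should sit in the second, and at $t = 2T$ it should be back in the first.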
Figure 84 Connected pendulums


Figure 85 Potential energy of strongly connected pendulums

Example 3. We investigate the characteristic oscillations of two different pendulums ($m_1 \ne m_2$,
$l_1 \ne l_2$, $g = 1$), connected by a spring with energy $\frac{1}{2}\alpha(q_1 - q_2)^2$ (Figure 84). How do the
characteristic frequencies behave as $\alpha \to 0$ or as $\alpha \to \infty$?

We have

$$T = \tfrac{1}{2}(m_1 l_1^2 \dot{q}_1^2 + m_2 l_2^2 \dot{q}_2^2),$$

$$U = m_1 l_1 \frac{q_1^2}{2} + m_2 l_2 \frac{q_2^2}{2} + \frac{\alpha}{2}(q_1 - q_2)^2.$$

Therefore (Figure 85),

$$A = \begin{pmatrix} m_1 l_1^2 & 0 \\ 0 & m_2 l_2^2 \end{pmatrix}, \quad B = \begin{pmatrix} m_1 l_1 + \alpha & -\alpha \\ -\alpha & m_2 l_2 + \alpha \end{pmatrix},$$

and the characteristic equation has the form

$$\det(B - \lambda A) = \begin{vmatrix} m_1 l_1 + \alpha - \lambda m_1 l_1^2 & -\alpha \\ -\alpha & m_2 l_2 + \alpha - \lambda m_2 l_2^2 \end{vmatrix} = 0$$

or

$$a\lambda^2 - (b_0 + b_1 \alpha)\lambda + (c_0 + c_1 \alpha) = 0,$$

where

$$a = m_1 m_2 l_1^2 l_2^2,$$

$$b_0 = m_1 l_1 m_2 l_2 (l_1 + l_2), \quad b_1 = m_1 l_1^2 + m_2 l_2^2,$$

$$c_0 = m_1 m_2 l_1 l_2, \quad c_1 = m_1 l_1 + m_2 l_2.$$

This is the equation of a hyperbola in the $(\alpha, \lambda)$-plane (Figure 86). As $\alpha \to 0$ (weak spring) the
frequencies approach the frequencies of free pendulums ($\omega_{1,2}^2 = l_{1,2}^{-1}$); as $\alpha \to \infty$, one of the
frequencies tends to $\infty$, while the other approaches the characteristic frequency $\omega_\infty$ of a
pendulum with two masses on one rod (Figure 87):

$$\omega_\infty^2 = \frac{m_1 l_1 + m_2 l_2}{m_1 l_1^2 + m_2 l_2^2}.$$

Figure 86 Dependence of characteristic frequencies on the stiffness of the spring

Figure 87 Limiting case of pendulums connected by an infinitely stiff spring
Problem. Investigate the characteristic oscillations of a planar double pendulum (Figure 88).

Figure 88 Double pendulum

Problem. Find the shape of the trajectories of the small oscillations of a point mass on the plane,
sitting inside an equilateral triangle and connected by identical springs to the vertices (Figure 89).

Figure 89 System with an infinite set of characteristic oscillations

Solution. Under rotation by 120° the system is mapped onto itself. Consequently, all
directions are characteristic, and both characteristic frequencies are the same: $U = \frac{1}{2}\omega^2(x^2 + y^2)$.
Therefore, the trajectories are ellipses (cf. Figure 20).


24  Behavior  of  characteristic  frequencies 

We prove here the Rayleigh-Courant-Fischer theorem on the behavior of characteristic
frequencies of a system under increases in rigidity and under imposed constraints.

A  Behavior  of  characteristic  frequencies  under  a 
change  in  rigidity 

Consider  a  system  performing  small  oscillations,  with  kinetic  and  potential 
energies 

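$$T = \tfrac{1}{2}(A\dot{q}, \dot{q}) > 0 \quad \text{and} \quad U = \tfrac{1}{2}(Bq, q) > 0 \quad \text{for all } q, \dot{q} \ne 0.$$

Definition. A system with the same kinetic energy, and a new potential energy
$U'$, is called more rigid if $U' = \frac{1}{2}(B'q, q) \ge \frac{1}{2}(Bq, q) = U$ for all $q$.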

We  wish  to  understand  how  the  characteristic  frequencies  change  under 
an  increase  in  the  rigidity  of  a  system. 

Problem.  Discuss  the  one-dimensional  case. 

Theorem 1. Under an increase in rigidity, all the characteristic frequencies
are increased, i.e., if $\omega_1 \le \omega_2 \le \cdots \le \omega_n$ are the characteristic frequencies
of the less rigid system, and $\omega'_1 \le \omega'_2 \le \cdots \le \omega'_n$ are the characteristic
frequencies of the more rigid system, then $\omega_1 \le \omega'_1$; $\omega_2 \le \omega'_2$; ...; $\omega_n \le \omega'_n$.

This theorem has a simple geometric meaning. Without loss of generality
we may assume that $A = E$, i.e., that we are considering the euclidean
structure given by the kinetic energy $T = \frac{1}{2}(\dot{q}, \dot{q})$. To each system we associate the
ellipsoids $E$: $(Bq, q) = 1$ and $E'$: $(B'q, q) = 1$.

It is clear that

Lemma 1. If the system $U'$ is more rigid than $U$, then the corresponding
ellipsoid $E'$ lies inside $E$.

It is also clear that

Lemma 2. The major semi-axes of the ellipsoid are the inverses of the
characteristic frequencies: $a_i = 1/\omega_i$.
Therefore, Theorem 1 is equivalent to the following geometric proposition
(Figure 90).

Figure 90 The semi-axes of the inside ellipse are smaller.

Theorem 2. If the ellipsoid $E$ with semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ contains the
ellipsoid $E'$ with semi-axes $a'_1 \ge a'_2 \ge \cdots \ge a'_n$, both ellipsoids having the
same center, then the semi-axes of the inside ellipsoid are smaller:

$$a_1 \ge a'_1,\ a_2 \ge a'_2,\ \ldots,\ a_n \ge a'_n.$$

Example. Under an increase in the rigidity $\alpha$ of the spring connecting the pendulums of Example
3, Section 23, the potential energy grows, and by Theorem 1, the characteristic frequencies grow:
$d\omega/d\alpha > 0$.

Now consider the case when the rigidity of the spring approaches infinity, $\alpha \to \infty$. Then in
the limit the pendulums are rigidly connected and we get a system with one degree of freedom;
the limiting characteristic frequency $\omega_\infty$ satisfies $\omega_1 \le \omega_\infty \le \omega_2$.
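The growth of the frequencies with $\alpha$ can be tabulated from the characteristic equation of Example 3, Section 23. The sketch below is only an illustration: the function names and the sample values $m_1 = 1$, $l_1 = 1$, $m_2 = 2$, $l_2 = 0.5$ are assumptions chosen here, and the coefficients are the $a$, $b_0 + b_1\alpha$, $c_0 + c_1\alpha$ of that example.

```python
import math

def characteristic_frequencies(m1, l1, m2, l2, alpha):
    """omega_1 <= omega_2 from a*lam^2 - (b0 + b1*alpha)*lam + (c0 + c1*alpha) = 0."""
    a = m1 * m2 * l1**2 * l2**2
    b = m1 * l1 * m2 * l2 * (l1 + l2) + (m1 * l1**2 + m2 * l2**2) * alpha
    c = m1 * m2 * l1 * l2 + (m1 * l1 + m2 * l2) * alpha
    root = math.sqrt(max(b * b - 4.0 * a * c, 0.0))   # guard against rounding
    return math.sqrt((b - root) / (2.0 * a)), math.sqrt((b + root) / (2.0 * a))

def rigid_limit_frequency(m1, l1, m2, l2):
    """Frequency of the pendulum with two masses on one rod (alpha -> infinity)."""
    return math.sqrt((m1 * l1 + m2 * l2) / (m1 * l1**2 + m2 * l2**2))

params = (1.0, 1.0, 2.0, 0.5)   # hypothetical m1, l1, m2, l2
for alpha in (0.0, 0.5, 1.0, 5.0, 50.0):
    print(alpha, characteristic_frequencies(*params, alpha))
print("limit:", rigid_limit_frequency(*params))
```

Both printed frequencies should increase with $\alpha$, with the lower one staying below the rigid limit and the upper one above it.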

B  Behavior  of  characteristic  frequencies  under  the 
imposition  of  a  constraint 

We return to a general system with $n$ degrees of freedom, and let $T = \frac{1}{2}(\dot{q}, \dot{q})$
and $U = \frac{1}{2}(Bq, q)$ ($q \in \mathbb{R}^n$) be the kinetic and potential energies of a system
performing small oscillations.

Figure 91 Linear constraint

Let $\mathbb{R}^{n-1} \subset \mathbb{R}^n$ be an $(n-1)$-dimensional subspace in $\mathbb{R}^n$ (Figure 91).
Consider the system with $n - 1$ degrees of freedom ($q \in \mathbb{R}^{n-1}$) whose kinetic
and potential energies are the restrictions of $T$ and $U$ to $\mathbb{R}^{n-1}$. We say that
this system is obtained from the original by imposition of a linear constraint.

Let $\omega_1 \le \omega_2 \le \cdots \le \omega_n$ be the $n$ characteristic frequencies of the original
system, and

$$\omega'_1 \le \omega'_2 \le \cdots \le \omega'_{n-1}$$

the $(n - 1)$ characteristic frequencies of the system with a constraint.

Figure 92 Separation of frequencies

Theorem 3. The characteristic frequencies of the system with a constraint
separate the characteristic frequencies of the original system (Figure 92):

$$\omega_1 \le \omega'_1 \le \omega_2 \le \omega'_2 \le \cdots \le \omega_{n-1} \le \omega'_{n-1} \le \omega_n.$$
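In the simplest case $n = 2$ a linear constraint confines the motion to a line $q = s\xi$, and the constrained system has the single frequency $\omega'^2 = (B\xi, \xi)/(A\xi, \xi)$. The sketch below is an illustration only: the function name, the use of the coupled pendulums of Example 1, Section 23 with $\alpha = 0.5$, and the sampled directions are choices made here. It checks that $\omega'$ lies between $\omega_1 = 1$ and $\omega_2 = \sqrt{1 + 2\alpha}$ for every direction of the constraint line.

```python
import math

def constrained_frequency(A, B, xi):
    """Frequency of the system restricted to the line q = s*xi:
    T = (1/2)(A xi, xi) s'^2 and U = (1/2)(B xi, xi) s^2."""
    def form(M, v):
        n = len(v)
        return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
    return math.sqrt(form(B, xi) / form(A, xi))

alpha = 0.5
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0 + alpha, -alpha], [-alpha, 1.0 + alpha]]
omega_1, omega_2 = 1.0, math.sqrt(1.0 + 2.0 * alpha)

for k in range(8):
    theta = math.pi * k / 8.0
    w = constrained_frequency(A, B, (math.cos(theta), math.sin(theta)))
    print(round(theta, 3), w, omega_1 <= w + 1e-12 and w <= omega_2 + 1e-12)
```

The extreme values are attained when the line is one of the two characteristic directions, in agreement with the extremal description of the eigenvalues.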

By Lemma 2 this theorem is equivalent to the following geometric
proposition.

Theorem 4. Consider the cross-section of the $n$-dimensional ellipsoid
$E = \{q : (Bq, q) = 1\}$ with semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ by a hyperplane $\mathbb{R}^{n-1}$
through its center. Then the semi-axes of this $(n-1)$-dimensional
ellipsoid, the cross-section $E'$, separate the semi-axes of the ellipsoid $E$
(Figure 93):

$$a_1 \ge a'_1 \ge a_2 \ge a'_2 \ge \cdots \ge a_{n-1} \ge a'_{n-1} \ge a_n.$$


Figure  93  The  semi-axes  of  the  intersection  separate  the  semi-axes  of  the  ellipsoid 


C  Extremal  properties  of  eigenvalues 

Theorem 5. The smallest semi-axis of any cross-section of the ellipsoid $E$ with
semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ by a subspace $\mathbb{R}^k$ is less than or equal to $a_k$:

$$a_k = \max_{\{\mathbb{R}^k\}} \ \min_{x \in \mathbb{R}^k \cap E} \|x\|$$

(the upper bound is attained on the subspace spanned by the semi-axes
$a_1 \ge a_2 \ge \cdots \ge a_k$).

Proof.⁴³ Consider the subspace $\mathbb{R}^{n-k+1}$ spanned by the axes $a_k \ge a_{k+1} \ge \cdots \ge a_n$.
Its dimension is $n - k + 1$. Therefore, it intersects $\mathbb{R}^k$. Let $x$ be a point
of the intersection lying on the ellipsoid. Then $\|x\| \le a_k$, since $x \in \mathbb{R}^{n-k+1}$.
Since $l \le \|x\|$, where $l$ is the length of the smallest semi-axis of the ellipsoid
$E \cap \mathbb{R}^k$, $l$ must be no larger than $a_k$. □

43 It is useful to think of the case $n = 3$, $k = 2$.

Proof of Theorem 2. The smallest semi-axis of every $k$-dimensional
section of the inner ellipsoid $\mathbb{R}^k \cap E'$ is less than or equal to the smallest
semi-axis of $\mathbb{R}^k \cap E$. By Theorem 5,

$$a'_k = \max_{\{\mathbb{R}^k\}} \ \min_{x \in \mathbb{R}^k \cap E'} \|x\| \le \max_{\{\mathbb{R}^k\}} \ \min_{x \in \mathbb{R}^k \cap E} \|x\| = a_k. \quad \Box$$

Proof of Theorem 4. The inequality $a'_k \le a_k$ follows from Theorem 5,
since in the calculation of $a_k$ the maximum is taken over a larger set. To prove
the inequality $a'_k \ge a_{k+1}$, we intersect $\mathbb{R}^{n-1}$ with any $(k+1)$-dimensional
subspace $\mathbb{R}^{k+1}$. The intersection has dimension greater than or equal to $k$.
The smallest semi-axis of the ellipsoid $E' \cap \mathbb{R}^{k+1}$ is greater than or equal to
the smallest semi-axis of $E \cap \mathbb{R}^{k+1}$. By Theorem 5,

$$a'_k = \max_{\{\mathbb{R}^k \subset \mathbb{R}^{n-1}\}} \ \min_{x \in \mathbb{R}^k \cap E'} \|x\| \ge \max_{\{\mathbb{R}^{k+1} \subset \mathbb{R}^n\}} \ \min_{x \in \mathbb{R}^{k+1} \cap E'} \|x\|$$

$$\ge \max_{\{\mathbb{R}^{k+1} \subset \mathbb{R}^n\}} \ \min_{x \in \mathbb{R}^{k+1} \cap E} \|x\| = a_{k+1}. \quad \Box$$

Theorems 1 and 3 follow directly from those just proven.

Problem.  Show  that  if  we  increase  the  kinetic  energy  of  a  system  without 
decreasing  the  potential  energy  (for  example,  we  increase  the  mass  on  a  given 
spring),  then  every  characteristic  frequency  decreases. 


Problem.  Show  that  under  the  orthogonal  projection  of  an  ellipsoid  lying  in  one  subspace  of 
euclidean  space  onto  another  subspace,  all  the  semi-axes  are  decreased. 


Problem. Suppose that a quadratic form $A(\varepsilon)$ on euclidean space $\mathbb{R}^n$ is a continuously
differentiable function of the parameter $\varepsilon$. Show that every characteristic frequency depends
differentiably on $\varepsilon$, and find the derivatives.

Answer. Let $\lambda_1, \ldots, \lambda_k$ be the eigenvalues of $A(0)$. To every eigenvalue $\lambda_i$ of multiplicity $\nu_i$ there
corresponds a subspace $\mathbb{R}^{\nu_i}$. The derivatives of the eigenvalues of $A(\varepsilon)$ at 0 are equal to the
eigenvalues of the restricted form $B = (dA/d\varepsilon)|_{\varepsilon=0}$ on $\mathbb{R}^{\nu_i}$.

In particular, if all the eigenvalues of $A(0)$ are simple, then their derivatives are equal to the
diagonal elements of the matrix $B$ in the characteristic basis for $A(0)$.

It follows from this problem that when a form is increased, its eigenvalues grow. In this way
we obtain new proofs of Theorems 1 and 2.
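The statement about simple eigenvalues can be tested on a small example. The sketch below is illustrative and rests on assumptions made here: a hypothetical family $A(\varepsilon) = A(0) + \varepsilon B$ with $A(0) = \operatorname{diag}(1, 3)$, so that the standard basis is already the characteristic basis, a closed-form eigenvalue formula for symmetric $2 \times 2$ matrices, and a central difference with an arbitrarily chosen step.

```python
import math

def eigenvalues_sym_2x2(M):
    """Eigenvalues (smaller, larger) of a symmetric 2x2 matrix."""
    (a, b), (_, d) = M
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

B = [[0.7, 0.4], [0.4, -0.2]]       # hypothetical dA/d(eps) at eps = 0
A0 = [[1.0, 0.0], [0.0, 3.0]]       # hypothetical A(0) with simple eigenvalues 1 and 3

def A_of(eps):
    return [[A0[i][j] + eps * B[i][j] for j in range(2)] for i in range(2)]

def eigenvalue_derivatives(h=1e-6):
    """Central-difference derivatives of the two eigenvalues of A(eps) at eps = 0."""
    lo_p, hi_p = eigenvalues_sym_2x2(A_of(h))
    lo_m, hi_m = eigenvalues_sym_2x2(A_of(-h))
    return (lo_p - lo_m) / (2.0 * h), (hi_p - hi_m) / (2.0 * h)

print(eigenvalue_derivatives())      # compare with the diagonal of B: 0.7 and -0.2
```

The off-diagonal element of $B$ does not enter the first derivatives; it only contributes at second order in $\varepsilon$.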

Problem.  How  does  the  pitch  of  a  bell  change  when  a  crack  appears  in  the  bell? 


25  Parametric  resonance 

If  the  parameters  of  a  system  vary  periodically  with  time,  then  an  equilibrium  position  can  be 
unstable,  even  if  it  is  stable  for  each  fixed  value  of  the  parameter.  This  instability  is  what  makes  it 
possible  to  swing  on  a  swing. 
A Dynamical systems whose parameters vary
periodically with time

Example 1. A swing: the length of the equivalent mathematical pendulum
$l(t)$ varies periodically with time: $l(t + T) = l(t)$ (Figure 94).

Example 2. A pendulum in a periodically varying gravitational field (for
example, the moon) is described by Hill’s equation:

(1) $\ddot{q} = -\omega^2(t)\, q, \quad \omega(t + T) = \omega(t).$

Example 3. A pendulum suspended from a point which periodically oscillates
vertically is also described by an equation of the form (1).

For systems with periodically varying parameters the right-hand sides of
the equations of motion are periodic functions of $t$. The equations of motion
can be written in the form of a system of first-order ordinary differential
equations

(2) $\dot{x} = f(x, t), \quad f(x, t + T) = f(x, t), \quad x \in \mathbb{R}^n$

with periodic right-hand sides. For example, Equation (1) can be written as
the system

(3) $\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1, \quad \omega(t + T) = \omega(t).$

B  The  mapping  at  a  period 

Recall the general properties of the system (2). We denote by $g^t: \mathbb{R}^n \to \mathbb{R}^n$ the
mapping taking $x \in \mathbb{R}^n$ to the value at time $t$, $g^t x = \varphi(t)$, of the solution $\varphi$ of
system (2) with initial conditions $\varphi(0) = x$ (Figure 95).

The mappings $g^t$ do not form a group: in general,

$$g^{t+s} \ne g^t g^s \ne g^s g^t.$$

Problem. Show that $\{g^t\}$ is a group if and only if the right-hand sides $f$ do not
depend on $t$.

Problem. Show that, if $T$ is the period of $f$, then $g^{T+s} = g^s \cdot g^T$ and, in
particular, $g^{nT} = (g^T)^n$, so that the mappings $g^{nT}$ ($n$ an integer) form a group.

Figure 95 Mapping at a period

The mapping $g^T: \mathbb{R}^n \to \mathbb{R}^n$ plays an important role in what is to come; we
will call it the mapping at a period and will denote it by

$$A: \mathbb{R}^n \to \mathbb{R}^n, \quad A x(0) = x(T).$$

Example. For the systems

$$\dot{x}_1 = x_2, \ \dot{x}_2 = -x_1 \quad \text{and} \quad \dot{x}_1 = x_2, \ \dot{x}_2 = x_1,$$

which can be considered periodic with any period $T$, the mapping $A$ is a rotation or a
hyperbolic rotation (Figure 96).

Figure 96 Rotation and hyperbolic rotation


Theorem.

1. The point $x_0$ is a fixed point of the mapping $A$ ($Ax_0 = x_0$) if and only if the
solution with initial conditions $x(0) = x_0$ is periodic with period $T$.

2. The periodic solution $x(t)$ is Liapunov stable (asymptotically stable) if and
only if the fixed point $x_0$ of the mapping $A$ is Liapunov stable (asymptotically
stable).⁴⁴

3. If the system (2) is linear, i.e., $f(x, t) = f(t)x$ is a linear function of $x$,
then $A$ is linear.

4. If the system (2) is hamiltonian, then $A$ preserves volume: $\det A_* = 1$.

44 A fixed point $x_0$ of the mapping $A$ is Liapunov stable (respectively, asymptotically stable) if
$\forall \varepsilon > 0$, $\exists \delta > 0$ such that if $|x - x_0| < \delta$, then $|A^n x - A^n x_0| < \varepsilon$ for all $0 < n < \infty$
(respectively, $A^n x - A^n x_0 \to 0$ as $n \to \infty$).

Proof. Assertions (1) and (2) follow from the relationship $g^{T+s} = g^s A$.
Assertion (3) follows from the fact that a sum of solutions of a linear system
is again a solution. Assertion (4) follows from Liouville’s theorem. □

We apply the theorem above to the mapping $A$ of the phase plane $\{(x_1, x_2)\}$
onto itself, corresponding to the equation (1) and the system (3). Since (3) is
linear and hamiltonian ($H = \frac{1}{2}\omega^2 x_1^2 + \frac{1}{2} x_2^2$), we get:

Corollary. The mapping $A$ is linear, and preserves area ($\det A = 1$). The trivial
solution of Equation (1) is stable if and only if the mapping $A$ is stable.

Problem. Show that a rotation of the plane is a stable mapping, and a
hyperbolic rotation is unstable.

C  Linear  mappings  of  the  plane  to  itself  which 
preserve  area 

Theorem.  Let  A  be  the  matrix  of  a  linear  mapping  of  the  plane  to  itself  which 
preserves  area  (det  A  —  1).  Then  the  mapping  A  is  stable  if  \tr  A  \  <  2,  and 
unstable  i/|tr  A  \  >  2  (tr  A  =  au  +  a22). 

Proof.  Let  Xx  and  A2  be  the  eigenvalues  of  A.  They  satisfy  the  characteristic 
equation  A2  —  (tr  A)A  +1  =  0  with  real  coefficients  Xl  +  A2  =  tr  A  and 
Xl  •  A2  =  det  >1  =  1.  The  roots  Xt  and  A2  of  this  real  quadratic  equation  are 
real  for  |tr  A  \  >  2  and  complex  conjugate  for  |  tr  A  \  <  2. 

In  the  first  case  one  of  the  eigenvalues  has  absolute  value  greater  than  1, 
and  one  has  absolute  value  less  than  1 ;  the  mapping  A  is  a  hyperbolic 
rotation  and  is  unstable  (Figure  97). 


Figure  97  Eigenvalues  of  the  mapping  A 


In the second case the eigenvalues lie on the unit circle (Figure 97):

$$1 = \lambda_1 \cdot \lambda_2 = \lambda_1 \cdot \bar{\lambda}_1 = |\lambda_1|^2.$$

The mapping $A$ is equivalent to a rotation through angle $\alpha$ (where $\lambda_{1,2} = e^{\pm i\alpha}$),
i.e., it may be reduced to a rotation by means of an appropriate choice of
coordinates on the plane. Therefore, it is stable. □

In this way, every question about the stability of the trivial solution of an
equation of the form (1) is reduced to computation of the trace of the matrix
$A$. Unfortunately, the calculation of this trace can be done explicitly only in
special cases. It is always possible to find the trace approximately by
numerically integrating the equation on the interval $0 \le t \le T$. In the important
case when $\omega(t)$ is close to a constant, some simple general arguments can help.
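
The numerical route mentioned above can be sketched as follows. This is only an illustration under stated assumptions: Equation (1) is taken in the form $\ddot{x} = -\omega^2(t)x$ with a $T$-periodic coefficient, the two basis solutions are integrated over one period with the classical fourth-order Runge-Kutta scheme, and the trace criterion is then applied. The names `monodromy` and `is_stable` and the default step count are illustrative choices, not notation from the text.

```python
import math

def monodromy(omega_sq, period, steps=2000):
    """Matrix A of the map over one period for x'' = -omega_sq(t) * x.

    The columns of A are the states (x, x') at t = period of the two
    solutions that start from (1, 0) and (0, 1).
    """
    h = period / steps

    def rhs(t, x, v):
        # First-order form: x' = v, v' = -omega_sq(t) * x.
        return v, -omega_sq(t) * x

    columns = []
    for x, v in ((1.0, 0.0), (0.0, 1.0)):
        for i in range(steps):
            t = i * h
            k1x, k1v = rhs(t, x, v)
            k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
            k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
            k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        columns.append((x, v))
    (a11, a21), (a12, a22) = columns
    return [[a11, a12], [a21, a22]]


def is_stable(A):
    """Trace criterion of the theorem: stable when |tr A| < 2."""
    return abs(A[0][0] + A[1][1]) < 2.0
```

For a constant coefficient the computed trace can be compared with $2\cos \omega T$; for a coefficient such as $\omega^2(1 + \varepsilon \cos t)$ with $\omega = 1/2$ the criterion reports instability even for small $\varepsilon$.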

D  Strong  stability 

Definition.  The  trivial  solution  of  a  hamiltonian  linear  system  is  strongly 
stable  if  it  is  stable,  and  if  the  trivial  solution  of  every  sufficiently  close 
linear  hamiltonian  system  is  also  stable.45 

The  two  theorems  above  imply: 

Corollary.  If  |  tr  A  |  <  2,  then  the  trivial  solution  is  strongly  stable. 

Proof. If $|\operatorname{tr} A| < 2$, then a mapping $A'$ corresponding to a sufficiently close
system will also have $|\operatorname{tr} A'| < 2$. □

Let us apply this to a system with almost constant (only slightly varying)
coefficients. Consider, for example, the equation

(4) $\ddot{x} = -\omega^2(1 + \varepsilon a(t))x, \qquad \varepsilon \ll 1,$

where $a(t + 2\pi) = a(t)$, e.g., $a(t) = \cos t$ (Figure 98) (a pendulum whose
frequency oscillates near $\omega$ with small amplitude and period $2\pi$).46


We will represent each system of the form (4) by a point in the plane of
parameters $\varepsilon$, $\omega > 0$. Clearly, the stable systems with $|\operatorname{tr} A| < 2$ form an
open set in the $(\omega, \varepsilon)$-plane; so do the unstable systems with $|\operatorname{tr} A| > 2$
(Figure 99).

The boundary of stability is given by the equation $|\operatorname{tr} A| = 2$.

Theorem. All points on the $\omega$-axis except the integers and half-integers
$\omega = k/2$, $k = 0, 1, 2, \ldots$ correspond to strongly stable systems (4).


45 The distance between two linear systems with periodic coefficients, $\dot{x} = B_1(t)x$, $\dot{x} = B_2(t)x$,
is defined as the maximum over $t$ of the distance between the operators $B_1(t)$ and $B_2(t)$.

46 In the case $a(t) = \cos t$, Equation (4) is called Mathieu's equation.


Figure  99  Zones  of  parametric  resonance 


Thus, the set of unstable systems can approach the $\omega$-axis only at the
points $\omega = k/2$. In other words, swinging a swing by small periodic changes
of the length is possible only in the case when one period of the change in
length is close to a whole number of half-periods of characteristic oscillations,
a result well known experimentally.

The proof of the theorem above is based on the fact that for $\varepsilon = 0$, Equation
(4) has constant coefficients and is clearly solvable.

Problem. Calculate the matrix of the transformation $A$ after period $T = 2\pi$
in the basis $x$, $\dot{x}$ for system (4) with $\varepsilon = 0$.

Solution. The general solution is:

$$x = c_1 \cos \omega t + c_2 \sin \omega t.$$

The solution with initial conditions $x = 1$, $\dot{x} = 0$ is:

$$x = \cos \omega t, \qquad \dot{x} = -\omega \sin \omega t.$$

The solution with initial conditions $x = 0$, $\dot{x} = 1$ is:

$$x = \frac{1}{\omega} \sin \omega t, \qquad \dot{x} = \cos \omega t.$$

Answer.

$$A = \begin{pmatrix} \cos 2\pi\omega & \frac{1}{\omega} \sin 2\pi\omega \\ -\omega \sin 2\pi\omega & \cos 2\pi\omega \end{pmatrix}.$$

Therefore, $|\operatorname{tr} A| = |2 \cos 2\omega\pi| < 2$ if $\omega \ne k/2$, $k = 0, 1, \ldots$, and the
theorem follows from the preceding corollary.
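
As a quick numerical check of this answer, one can evaluate the matrix for a few values of $\omega$ and confirm that $\det A = 1$ and that $|\operatorname{tr} A|$ reaches 2 only at integers and half-integers. The sketch below assumes the closed form just stated; the helper names `period_matrix`, `trace` and `det` are hypothetical.

```python
import math

def period_matrix(omega, T=2 * math.pi):
    """Map over one period T in the basis (x, x') for x'' = -omega**2 * x."""
    c, s = math.cos(omega * T), math.sin(omega * T)
    return [[c, s / omega], [-omega * s, c]]

def trace(A):
    return A[0][0] + A[1][1]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

if __name__ == "__main__":
    for omega in (0.3, 0.5, 0.75, 1.0, 1.5):
        A = period_matrix(omega)
        print(omega, round(trace(A), 6), round(det(A), 6))
```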

A more careful analysis47 shows that in general (and for $a(t) = \cos t$)
the region of instability (shaded in Figure 99) in fact approaches the $\omega$-axis
near the points $\omega = k/2$, $k = 1, 2, \ldots$.

47 Cf., for example, the problem analyzed below.

Thus, for $\omega \approx k/2$, $k = 1, 2, \ldots$, the lowest equilibrium position of the
idealized swing (4) is unstable and it swings under an arbitrarily small
periodic change of length. This phenomenon is called parametric resonance.
A characteristic property of parametric resonance is that it is strongest when
the frequency of the variation of the parameter $\nu$ (in Equation (4), $\nu = 1$)
is twice the characteristic frequency $\omega$.

Remark. Theoretically, parametric resonance can be observed for the
infinite collection of cases $\omega/\nu \approx k/2$, $k = 1, 2, \ldots$. In practice, it is usually
observed only when $k$ is small ($k = 1, 2$, and more rarely, 3). The reason is
that:

1. For large $k$ the region of instability approaches the $\omega$-axis in a very narrow
"tongue" and the resonance frequencies $\omega$ must satisfy very rigid bounds
($\sim \varepsilon\theta^k$, where $\theta \in (0, 1)$ depends on the width of the analyticity band for the
function $a(t)$ in (4)).

2. The instability itself is weak for large $k$, since $|\operatorname{tr} A| - 2$ is small and the
eigenvalues are close to 1 for large $k$.

3. If there is an arbitrarily small amount of friction, then there is a minimal
value $\varepsilon_k$ of the amplitude in order for parametric resonance to begin (for $\varepsilon$
less than this the oscillation dies out). As $k$ grows, $\varepsilon_k$ grows quickly (Figure
100).




Figure  100  Influence  of  friction  on  parametric  resonance 

We  also  notice  that  for  Equation  (4)  the  size  of  x  grows  without  bound  in 
the  unstable  case.  In  real  systems,  oscillations  attain  only  finite  amplitudes, 
since  for  large  x  the  linear  equation  (4)  itself  loses  influence,  and  we  must 
consider  the  nonlinear  effects. 

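Problem. Find the shape of the region of stability in the $\varepsilon,\omega$-plane for the system described by
the equations

$$\ddot{x} = -f^2(t)x, \qquad f(t) = \begin{cases} \omega + \varepsilon, & 0 < t < \pi \\ \omega - \varepsilon, & \pi < t < 2\pi \end{cases} \qquad \varepsilon \ll 1,$$

$$f(t + 2\pi) = f(t).$$

Solution. It follows from the solution of the preceding problem that $A = A_2 A_1$, where

$$A_k = \begin{pmatrix} c_k & \frac{1}{\omega_k} s_k \\ -\omega_k s_k & c_k \end{pmatrix}, \qquad c_k = \cos \pi\omega_k, \quad s_k = \sin \pi\omega_k, \quad \omega_{1,2} = \omega \pm \varepsilon.$$

Therefore, the boundary of the zone of stability has the equation

(5) $$|\operatorname{tr} A| = \left| 2c_1c_2 - \left( \frac{\omega_1}{\omega_2} + \frac{\omega_2}{\omega_1} \right) s_1 s_2 \right| = 2.$$

Since $\varepsilon \ll 1$, we have $\omega_1/\omega_2 = (\omega + \varepsilon)/(\omega - \varepsilon) \approx 1$. We introduce the notation

$$\frac{\omega_1}{\omega_2} + \frac{\omega_2}{\omega_1} = 2(1 + \Delta).$$

Then, as is easily computed, $\Delta = (2\varepsilon^2/\omega^2) + O(\varepsilon^4) \ll 1$. Using the relations
$2c_1c_2 = \cos 2\pi\varepsilon + \cos 2\pi\omega$ and $2s_1s_2 = \cos 2\pi\varepsilon - \cos 2\pi\omega$, we rewrite Equation (5) in the form

$$-\Delta \cos 2\pi\varepsilon + (2 + \Delta)\cos 2\pi\omega = \pm 2$$

or

(6a) $$\cos 2\pi\omega = \frac{2 + \Delta \cos 2\pi\varepsilon}{2 + \Delta},$$

(6b) $$\cos 2\pi\omega = \frac{-2 + \Delta \cos 2\pi\varepsilon}{2 + \Delta}.$$

In the first case $\cos 2\pi\omega \approx 1$. Therefore, we set

$$\omega = k + a, \quad |a| \ll 1; \qquad \cos 2\pi\omega = \cos 2\pi a = 1 - 2\pi^2 a^2 + O(a^4).$$

We rewrite Equation (6a) in the form

$$\cos 2\pi\omega = 1 - \frac{\Delta}{2 + \Delta}(1 - \cos 2\pi\varepsilon)$$

or $2\pi^2 a^2 + O(a^4) = \Delta\pi^2\varepsilon^2 + O(\varepsilon^6)$.

Substituting in the value $\Delta = (2\varepsilon^2/\omega^2) + O(\varepsilon^4)$, we find

$$a = \pm\frac{\varepsilon^2}{\omega} + o(\varepsilon^2), \quad \text{i.e.,} \quad \omega = k \pm \frac{\varepsilon^2}{k} + o(\varepsilon^2).$$

Equation (6b) is solved analogously; for the result we get

$$\omega = k + \frac{1}{2} \pm \frac{\varepsilon}{\pi(k + \frac{1}{2})} + o(\varepsilon).$$

Therefore the answer has the form depicted in Figure 101.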
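
The asymptotic boundaries can be compared with the exact trace, which for this piecewise-constant frequency is just the trace of a product of two explicit matrices. The following sketch assumes the half-period matrices have the same form as in the preceding problem (with $\pi$ in place of $2\pi$); the function names are hypothetical and the sample values of $\omega$ and $\varepsilon$ are arbitrary.

```python
import math

def half_period_matrix(w):
    """Map over a half-period pi in the basis (x, x') for x'' = -w**2 * x."""
    c, s = math.cos(math.pi * w), math.sin(math.pi * w)
    return [[c, s / w], [-w * s, c]]

def matmul(P, Q):
    return [[sum(P[i][m] * Q[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def trace_A(omega, eps):
    """Trace of A = A2 * A1 with frequencies omega + eps, then omega - eps."""
    A1 = half_period_matrix(omega + eps)  # acts first, on 0 < t < pi
    A2 = half_period_matrix(omega - eps)  # acts second, on pi < t < 2*pi
    A = matmul(A2, A1)
    return A[0][0] + A[1][1]

def half_integer_zone_half_width(k, eps):
    """Leading-order half-width eps / (pi * (k + 1/2)) of the zone at k + 1/2."""
    return eps / (math.pi * (k + 0.5))

if __name__ == "__main__":
    eps = 0.01
    w = half_integer_zone_half_width(0, eps)
    for omega in (0.5, 0.5 + w, 0.5 + 2 * w):
        print(omega, trace_A(omega, eps))
```

Inside the zone the absolute value of the trace exceeds 2, on the predicted boundary it is close to 2, and outside it drops below 2.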


Figure 101 Zones of parametric resonance for $f = \omega \pm \varepsilon$.

E Stability of an inverted pendulum with vertically
oscillating point of suspension

Problem. Can the topmost, usually unstable, equilibrium position of a
pendulum become stable if the point of suspension oscillates in the vertical
direction (Figure 102)?


Figure 102 Inverted pendulum with oscillating point of suspension

Let the length of the pendulum be $l$, the amplitude of the oscillation of the
point of suspension be $a \ll l$, the period of oscillation of the point of
suspension $2\tau$, and, moreover, in the course of every half-period let the acceleration
of the point of suspension be constant and equal to $\pm c$ (then $c = 8a/\tau^2$). It
turns out that for fast enough oscillations of the point of suspension ($\tau \ll 1$)
the topmost equilibrium becomes stable.


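Solution. The equation of motion can be written in the form $\ddot{x} = (\omega^2 \pm d^2)x$ (the sign changes
after time $\tau$), where $\omega^2 = g/l$ and $d^2 = c/l$. If the oscillation of the suspension is fast enough,
then $d^2 > \omega^2$ ($d^2 = 8a/l\tau^2$).

As in the previous problem, $A = A_2 A_1$, where

$$A_1 = \begin{pmatrix} \operatorname{ch} k\tau & \frac{1}{k} \operatorname{sh} k\tau \\ k \operatorname{sh} k\tau & \operatorname{ch} k\tau \end{pmatrix}, \qquad A_2 = \begin{pmatrix} \cos \Omega\tau & \frac{1}{\Omega} \sin \Omega\tau \\ -\Omega \sin \Omega\tau & \cos \Omega\tau \end{pmatrix},$$

$$k^2 = d^2 + \omega^2, \qquad \Omega^2 = d^2 - \omega^2.$$

The stability condition $|\operatorname{tr} A| < 2$ therefore has the form

(7) $$\left| 2 \operatorname{ch} k\tau \cos \Omega\tau + \left( \frac{k}{\Omega} - \frac{\Omega}{k} \right) \operatorname{sh} k\tau \sin \Omega\tau \right| < 2.$$

We will show that this condition is fulfilled for sufficiently fast oscillations of the point of
suspension, i.e., when $c \gg g$. We introduce the dimensionless variables $\varepsilon$, $\mu$:

$$\frac{a}{l} = \varepsilon^2 \ll 1, \qquad \frac{g}{c} = \mu^2 \ll 1.$$

Then

$$k\tau = 2\sqrt{2}\,\varepsilon\sqrt{1 + \mu^2}, \qquad \Omega\tau = 2\sqrt{2}\,\varepsilon\sqrt{1 - \mu^2},$$

$$\frac{k}{\Omega} - \frac{\Omega}{k} = \sqrt{\frac{1 + \mu^2}{1 - \mu^2}} - \sqrt{\frac{1 - \mu^2}{1 + \mu^2}} = 2\mu^2 + O(\mu^4).$$

Therefore, for small $\varepsilon$ and $\mu$ we have the following expansion with error $o(\varepsilon^4 + \mu^4)$:

$$\operatorname{ch} k\tau = 1 + 4\varepsilon^2(1 + \mu^2) + \tfrac{8}{3}\varepsilon^4 + \cdots, \qquad \cos \Omega\tau = 1 - 4\varepsilon^2(1 - \mu^2) + \tfrac{8}{3}\varepsilon^4 + \cdots,$$

$$\left( \frac{k}{\Omega} - \frac{\Omega}{k} \right) \operatorname{sh} k\tau \sin \Omega\tau = 16\varepsilon^2\mu^2 + \cdots,$$

so the stability condition (7) takes the form

$$2\left(1 - 16\varepsilon^4 + \tfrac{16}{3}\varepsilon^4 + 8\varepsilon^2\mu^2 + \cdots\right) + 16\varepsilon^2\mu^2 < 2,$$

i.e., disregarding the small higher-order terms, $\tfrac{4}{3} \cdot 16\varepsilon^4 > 32\mu^2\varepsilon^2$ or $\mu < \varepsilon\sqrt{2/3}$, or $g/c < 2a/3l$.
This condition can be rewritten as

$$N > \frac{\sqrt{3}}{8}\,\omega\,\frac{l}{a} \approx 0.22\,\omega\,\frac{l}{a},$$

where $N = 1/2\tau$ is the number of oscillations of the point in one unit of time. For example, if the
length of the pendulum $l$ is 20 cm, and the amplitude of the oscillation of the point of suspension
$a$ is 1 cm, then

$$N > 0.22 \cdot \sqrt{\frac{980}{20}} \cdot 20 \approx 31 \text{ (oscillations per second)}.$$

For example, the topmost position is stable if the frequency of oscillation of the point of
suspension is greater than 40 per second.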
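
The exact condition (7) is easy to evaluate directly, which gives a check on the asymptotic estimate. The sketch below is an illustration only: it assumes SI units with $g = 9.8\ \text{m/s}^2$, uses $c = 8a/\tau^2$ and $N = 1/2\tau$ from the statement of the problem, and the function names are hypothetical. Since $a/l$ is not very small in the numerical example, the exact threshold need not coincide with the asymptotic value.

```python
import math

def pendulum_trace(l, a, N, g=9.8):
    """Left-hand side of condition (7), without the absolute value.

    l: pendulum length, a: amplitude of the suspension point,
    N: oscillations of the suspension point per unit time.
    """
    tau = 1.0 / (2.0 * N)      # half-period of the suspension point
    c = 8.0 * a / tau ** 2     # constant acceleration on each half-period
    d2, w2 = c / l, g / l      # d^2 and omega^2
    if d2 <= w2:
        raise ValueError("formula (7) assumes d^2 > omega^2")
    k = math.sqrt(d2 + w2)
    Om = math.sqrt(d2 - w2)
    return (2.0 * math.cosh(k * tau) * math.cos(Om * tau)
            + (k / Om - Om / k) * math.sinh(k * tau) * math.sin(Om * tau))

def top_position_is_stable(l, a, N, g=9.8):
    return abs(pendulum_trace(l, a, N, g)) < 2.0

def asymptotic_threshold(l, a, g=9.8):
    """Leading-order bound N > 0.22 * omega * l / a."""
    return 0.22 * math.sqrt(g / l) * l / a

if __name__ == "__main__":
    print(asymptotic_threshold(0.20, 0.01))
    for N in (20, 40):
        print(N, pendulum_trace(0.20, 0.01, N), top_position_is_stable(0.20, 0.01, N))
```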
Rigid bodies

6 


In  this  chapter  we  study  in  detail  some  very  special  mechanical  problems. 
These  problems  are  traditionally  included  in  a  course  on  classical  mechanics, 
first  because  they  were  solved  by  Euler  and  Lagrange,  and  also  because  we 
live  in  three-dimensional  euclidean  space,  so  that  most  of  the  mechanical 
systems  with  a  finite  number  of  degrees  of  freedom  which  we  are  likely  to 
encounter  consist  of  rigid  bodies. 


26  Motion  in  a  moving  coordinate  system 


In  this  paragraph  we  define  angular  velocity. 

A  Moving  coordinate  systems 

We look at a lagrangian system described in coordinates $q$, $t$ by the lagrangian
function $L(q, \dot{q}, t)$. It will often be useful to shift to a moving coordinate
system $Q = Q(q, t)$.

To  write  the  equations  of  motion  in  a  moving  system,  it  is  sufficient  to 
express  the  lagrangian  function  in  the  new  coordinates. 

Theorem. If the trajectory $\gamma$: $q = \varphi(t)$ of Lagrange's equations $d(\partial L/\partial \dot{q})/dt = \partial L/\partial q$
is written as $\gamma$: $Q = \Phi(t)$ in the local coordinates $Q$, $t$ (where $Q = Q(q, t)$),
then the function $\Phi(t)$ satisfies Lagrange's equations $d(\partial L'/\partial \dot{Q})/dt = \partial L'/\partial Q$,
where $L'(Q, \dot{Q}, t) = L(q, \dot{q}, t)$.

Proof. The trajectory $\gamma$ is an extremal: $\delta \int_\gamma L(q, \dot{q}, t)\,dt = 0$. Therefore,
$\delta \int_\gamma L'(Q, \dot{Q}, t)\,dt = 0$ and $\Phi(t)$ satisfies Lagrange's equations. □

B Motions, rotations, and translational motions

We  consider,  in  particular,  the  important  case  where  q  is  the  cartesian  radius 
vector  of  a  point  relative  to  an  inertial  coordinate  system  k  (which  we  will 
call  stationary ),  and  Q  is  the  cartesian  radius  vector  of  the  same  point  relative 
to  a  moving  coordinate  system  K. 

Definition. Let $k$ and $K$ be oriented euclidean spaces. A motion of $K$ relative
to $k$ is a mapping smoothly depending on $t$:

$$D_t: K \to k,$$

which preserves the metric and the orientation (Figure 103).


Figure 103 The motion $D_t$ decomposed as the product of a rotation $B_t$ and
translation $C_t$

Definition. A motion $D_t$ is called a rotation if it takes the origin of $K$ to the
origin of $k$, i.e., if $D_t$ is a linear operator.

Theorem. Every motion $D_t$ can be uniquely written as the composition of a
rotation $B_t: K \to k$ and a translation $C_t: k \to k$:

$$D_t = C_t B_t,$$

where $C_t q = q + r(t)$ ($q, r \in k$).

Proof. We set $r(t) = D_t 0$, $B_t = C_t^{-1} D_t$. Then $B_t 0 = 0$. □

Definition. A motion $D_t$ is called translational if the mapping $B_t: K \to k$
corresponding to it does not depend on $t$: $B_t = B_0 = B$, $D_t Q = BQ + r(t)$.

We will call $k$ a stationary coordinate system, $K$ a moving one, and
$q(t) \in k$ the radius-vector of a point moving relative to the stationary system;
if

(1) $q(t) = D_t Q(t) = B_t Q(t) + r(t)$

(Figure 104), $Q(t)$ is called the radius vector of the point relative to the moving
system.

Warning. The vector $B_t Q(t) \in k$ should not be confused with $Q(t) \in K$;
they lie in different spaces!

Figure 104 Radius vector of a point with respect to stationary ($q$) and moving ($Q$)
coordinate systems


C  Addition  of  velocities 

We will now express the "absolute velocity" $\dot{q}$ in terms of the relative motion
$Q(t)$ and the motion of the coordinate system, $D_t$. By differentiating with
respect to $t$ in formula (1) we find a formula for the addition of velocities

(2) $\dot{q} = \dot{B}Q + B\dot{Q} + \dot{r}.$

In order to clarify the meaning of the three terms in (2), we consider the
following special cases.

The case of translational motion ($\dot{B} = 0$)

In this case Equation (2) gives $\dot{q} = B\dot{Q} + \dot{r}$. In other words, we have shown

Theorem. If the moving system $K$ has a translational motion relative to $k$, then
the absolute velocity is equal to the sum of the relative velocity and the
velocity of the motion of the system $K$:

(3) $v = v' + v_0,$

where

$v = \dot{q} \in k$ is the absolute velocity,

$v' = B\dot{Q} \in k$ is the relative velocity (distinct from $\dot{Q} \in K$!),

$v_0 = \dot{r} \in k$ is the velocity of motion of the moving coordinate system.

D  Angular  velocity 

In the case of a rotation of $K$ the relationship between the relative and
absolute velocities is not so simple. We first consider the case when our point is
at rest in $K$ (i.e., $\dot{Q} = 0$) and the coordinate system $K$ rotates (i.e., $r = 0$).
In this case the motion of the point $q(t)$ is called a transferred rotation.


Example. Rotation with fixed angular velocity $\omega \in k$. Let $U(t): k \to k$ be the
rotation of the space $k$ around the $\omega$-axis through the angle $|\omega|t$. Then
$B(t) = U(t)B(0)$ is called a uniform rotation of $K$ with angular velocity $\omega$.

Figure 105 Angular velocity

Clearly, the velocity of the transferred motion of the point $q$ in this case is
given by the formula (Figure 105)

$$\dot{q} = [\omega, q].$$

We now turn to the general case of a rotation of $K$ ($r = 0$, $\dot{Q} = 0$).

Theorem. At every moment of time $t$, there is a vector $\omega(t) \in k$ such that the
transferred velocity is expressed by the formula

(4) $\dot{q} = [\omega, q], \quad \forall q \in k.$

The vector $\omega$ is called the instantaneous angular velocity; clearly, it is
defined uniquely by Equation (4).

Corollary.  Suppose  that  a  rigid  body  K  rotates  around  a  stationary  point  0  of 
the  space  k.  Then  at  every  moment  of  time  there  exists  an  instantaneous  axis 
of  rotation — the  straight  line  in  the  body  passing  through  0  such  that  the 
velocity  of  its  points  at  the  given  moment  of  time  is  equal  to  zero.  The 
velocity  of  the  remaining  points  is  perpendicular  to  this  straight  line  and  is 
proportional  to  the  distance  from  it. 

The instantaneous axis of rotation in $k$ is given by its vector $\omega$; in $K$ the
corresponding vector is denoted by $\Omega = B^{-1}\omega \in K$; $\Omega$ is called the vector of
angular velocity in the body.


Example. The angular velocity of the earth is directed from the center to the North Pole; its
length is equal to $2\pi/(3600 \cdot 24)\ \text{sec}^{-1} \approx 7.3 \cdot 10^{-5}\ \text{sec}^{-1}$.


Proof of the theorem. By (2) we have

$$\dot{q} = \dot{B}Q.$$

Therefore, if we express $Q$ in terms of $q$, we get $\dot{q} = \dot{B}B^{-1}q = Aq$, where
$A = \dot{B}B^{-1}: k \to k$ is a linear operator on $k$.

Lemma 1. The operator $A$ is skew-symmetric: $A^t + A = 0$.

Proof. Since $B: K \to k$ is an orthogonal operator from one euclidean space
to another, its transpose is its inverse: $B^t = B^{-1}: k \to K$. By differentiating
the relationship $BB^t = E$ with respect to $t$, we get

$$\dot{B}B^t + B\dot{B}^t = 0, \qquad \dot{B}B^{-1} + (\dot{B}B^{-1})^t = 0. \quad □$$


Lemma 2. Every skew-symmetric operator $A$ on a three-dimensional oriented
euclidean space is the operator of vector multiplication by a fixed vector:

$$Aq = [\omega, q] \quad \text{for all } q \in \mathbb{R}^3.$$

Proof. The skew-symmetric operators from $\mathbb{R}^3$ to $\mathbb{R}^3$ form a linear space.
Its dimension is 3, since a skew-symmetric $3 \times 3$ matrix is determined by its
three elements below the diagonal.

The operator of vector multiplication by $\omega$ is linear and skew-symmetric.
The operators of vector multiplication by all possible vectors $\omega$ in
three-space form a linear subspace of the space of all skew-symmetric operators.

The dimension of this subspace is equal to 3. Therefore, the subspace of
vector multiplications is the space of all skew-symmetric operators. □


Conclusion of the proof of the theorem. By Lemmas 1 and 2,

$$\dot{q} = Aq = [\omega, q]. \quad □$$


In cartesian coordinates the operator $A$ is given by an antisymmetric
matrix; we denote its elements by $\pm\omega_{1,2,3}$:

$$A = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}.$$

In this notation the vector $\omega = \omega_1 e_1 + \omega_2 e_2 + \omega_3 e_3$ will be an eigenvector
with eigenvalue 0. By applying $A$ to the vector $q = q_1 e_1 + q_2 e_2 + q_3 e_3$,
we obtain by a direct calculation

$$Aq = [\omega, q].$$
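
The direct calculation can be carried out mechanically. The sketch below builds the antisymmetric matrix from the components of $\omega$ and compares its action with the vector product; the helper names `hat`, `cross` and `apply` are illustrative and do not come from the text.

```python
def hat(w):
    """Antisymmetric matrix A with A q = [w, q] for w = (w1, w2, w3)."""
    w1, w2, w3 = w
    return [[0, -w3, w2],
            [w3, 0, -w1],
            [-w2, w1, 0]]

def cross(a, b):
    """Vector product [a, b] in a right-handed orthonormal basis."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def apply(A, q):
    """Matrix-vector product A q."""
    return [sum(A[i][j] * q[j] for j in range(3)) for i in range(3)]

if __name__ == "__main__":
    w, q = [1, 2, 3], [4, 5, 6]
    print(apply(hat(w), q), cross(w, q))  # the two results agree
    print(apply(hat(w), w))               # w is an eigenvector with eigenvalue 0
```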


E  Transferred  velocity 

The  case  of  purely  rotational  motion 

Suppose now that the system $K$ rotates ($r = 0$), and that a point in $K$
is moving ($\dot{Q} \ne 0$). From (2) we find (Figure 106)

$$\dot{q} = \dot{B}Q + B\dot{Q} = [\omega, q] + v'.$$

In other words, we have shown


Figure 106 Addition of velocities

Theorem. If a moving system $K$ rotates relative to $0 \in k$, then the absolute
velocity is equal to the sum of the relative velocity and the transferred
velocity:

(5) $v = v' + v_n,$

where

$v = \dot{q} \in k$ is the absolute velocity,

$v' = B\dot{Q} \in k$ is the relative velocity,

$v_n = \dot{B}Q = [\omega, q] \in k$ is the transferred velocity of rotation.

Finally, the general case can be reduced to the two cases above, if we
consider an auxiliary system $K_1$ which moves by translation with respect to
$k$ and with respect to which $K$ moves by rotating around $0 \in K_1$. From
formula (2) one can see that

$$v = v' + v_n + v_0,$$

where

$v = \dot{q} \in k$ is the absolute velocity,

$v' = B\dot{Q} \in k$ is the relative velocity,

$v_n = \dot{B}Q = [\omega, q - r] \in k$ is the transferred velocity of rotation,

and

$v_0 = \dot{r} \in k$ is the velocity of motion of the moving coordinate system.

Problem.  Show  that  the  angular  velocity  of  a  rigid  body  does  not  depend  on 
the  choice  of  origin  of  the  moving  system  K  in  the  body. 

Problem. Show that the most general movement of a rigid body is a helical
movement, i.e., the composition of a rotation through angle $\varphi$ around some
axis and a translation by $h$ along it.

Problem. A watch lies on a table. Find the angular velocity of the hands of the watch: (a) relative
to the earth, (b) relative to an inertial coordinate system.

Hint. If we are given three coordinate systems $k$, $K_1$, and $K_2$, then the angular velocity of $K_2$
relative to $k$ is equal to the sum of the angular velocities of $K_1$ relative to $k$ and of $K_2$ relative
to $K_1$, since

$$(E + A_1 t + \cdots)(E + A_2 t + \cdots) = E + (A_1 + A_2)t + \cdots.$$


27  Inertial  forces  and  the  Coriolis  force 

The  equations  of  motion  in  a  non-inertial  coordinate  system  differ  from  the  equations  of  motion 
in  an  inertial  system  by  additional  terms  called  inertial  forces.  This  allows  us  to  detect  experi¬ 
mentally  the  non-inertial  nature  of  a  system  (for  example,  the  rotation  of  the  earth  around  its 
axis). 

A  Coordinate  systems  moving  by  translation 

Theorem. In a coordinate system $K$ which moves by translation relative to an
inertial system $k$, the motion of a mechanical system takes place as if the
coordinate system were inertial, but on every point of mass $m$ an additional
"inertial force" acted: $F = -m\ddot{r}$, where $\ddot{r}$ is the acceleration of the system $K$.

Proof. If $Q = q - r(t)$, then $m\ddot{Q} = m\ddot{q} - m\ddot{r}$. The effect of the translation of
the coordinate system is reduced in this way to the appearance of an
additional homogeneous force field $-mW$, where $W$ is the acceleration of the
origin. □

Figure 107 Overload

Example 1. At the moment of takeoff, a rocket has acceleration $\ddot{r}$ directed upward (Figure 107).
Thus, the coordinate system $K$ connected to the rocket is not inertial, and an observer inside can
detect the existence of a force field $mW$ and measure the inertial force, for example, by means of
weighted springs. In this case the inertial force is called overload.*

Example 2. When jumping from a loft, a person has acceleration $g$, directed downwards. Thus,
the sum of the inertial force and the force of gravity is equal to zero: weighted springs show that
the weight of any object is equal to zero, so such a state is called weightlessness. In exactly the
same way, weightlessness is observed in the free ballistic flight of a satellite since the force of
inertia is opposite to the gravitational force of the earth.

Example 3. If the point of suspension of a pendulum moves with acceleration $W(t)$, then the
pendulum moves as if the force of gravity $g$ were variable and equal to $g - W(t)$.

* Translator's note. The word overload is the literal translation of the Russian term peregruzka.
There does not seem to be an English term for this particular kind of inertial force.

B Rotating coordinate systems

Let $B_t: K \to k$ be a rotation of the coordinate system $K$ relative to the
stationary coordinate system $k$. We will denote by $Q(t) \in K$ the radius vector of
a moving point in the moving coordinate system, and by $q(t) = B_t Q(t) \in k$
the radius vector in the stationary system. The vector of angular velocity in
the moving coordinate system is denoted, as in Section 26, by $\Omega$. We assume
that the motion of the point $q$ in $k$ is subject to Newton's equation $m\ddot{q} = f(q, \dot{q})$.

Theorem. Motion in a rotating coordinate system takes place as if three
additional inertial forces acted on every moving point $Q$ of mass $m$:

1. the inertial force of rotation: $m[\dot{\Omega}, Q]$,

2. the Coriolis force: $2m[\Omega, \dot{Q}]$, and

3. the centrifugal force: $m[\Omega, [\Omega, Q]]$.

Thus

$$m\ddot{Q} = F - m[\dot{\Omega}, Q] - 2m[\Omega, \dot{Q}] - m[\Omega, [\Omega, Q]],$$

where

$$BF(Q, \dot{Q}) = f\left(BQ, \tfrac{d}{dt}(BQ)\right).$$

The  first  of  the  inertial  forces  is  observed  only  in  nonuniform  rotation. 
The  second  and  third  are  present  even  in  uniform  rotation. 

Figure 108 Centrifugal force of inertia


The centrifugal force (Figure 108) is always directed outward from the
instantaneous axis of rotation $\Omega$; it has magnitude $|\Omega|^2 r$, where $r$ is the
distance to this axis. This force does not depend on the velocity of the relative
motion, and acts even on a body at rest in the coordinate system $K$.

The Coriolis force depends on the velocity $\dot{Q}$. In the northern hemisphere
of the earth it deflects every body moving along the earth to the right, and
every falling body eastward.

Proof of the theorem. We notice that for any vector $X \in K$ we have
$\dot{B}X = B[\Omega, X]$. In fact, by Section 26, $\dot{B}X = [\omega, x] = [B\Omega, BX]$, with $x = BX$. This is
equal to $B[\Omega, X]$ since the operator $B$ preserves the metric and orientation,
and therefore the vector product.

Since $q = BQ$ we see that $\dot{q} = \dot{B}Q + B\dot{Q} = B(\dot{Q} + [\Omega, Q])$.
Differentiating once more, we obtain

$$\begin{aligned} \ddot{q} &= \dot{B}(\dot{Q} + [\Omega, Q]) + B(\ddot{Q} + [\dot{\Omega}, Q] + [\Omega, \dot{Q}]) \\ &= B([\Omega, (\dot{Q} + [\Omega, Q])] + \ddot{Q} + [\dot{\Omega}, Q] + [\Omega, \dot{Q}]) \\ &= B(\ddot{Q} + 2[\Omega, \dot{Q}] + [\Omega, [\Omega, Q]] + [\dot{\Omega}, Q]). \quad □ \end{aligned}$$

(We again used the relationship $\dot{B}X = B[\Omega, X]$; this time $X = \dot{Q} + [\Omega, Q]$.)

We will consider in more detail the effect of the earth's rotation on laboratory experiments.
Since the earth rotates practically uniformly, we can take $\dot{\Omega} = 0$. The centrifugal force has its
largest value at the equator, where it attains $\Omega^2\rho/g \approx (7.3 \times 10^{-5})^2 \cdot 6.4 \times 10^6/9.8 \approx 3/1000$
the weight. Within the limits of a laboratory it changes little, so to observe it one must travel
some distance. Thus, within the limits of a laboratory the rotation of the earth appears only in
the form of the Coriolis force: in the coordinate system $Q$ associated to the earth, we have, with
good accuracy,

$$m\ddot{Q} = mg + 2m[\dot{Q}, \Omega]$$

(the centrifugal force is taken into account in $g$).

Example 1. A stone is thrown (without initial velocity) into a 250 m deep mine shaft at the
latitude of Leningrad. How far does it deviate from the vertical?

We solve the equation

$$\ddot{Q} = g + 2[\dot{Q}, \Omega]$$

by the following approach, taking $\Omega \ll 1$. We set (Figure 109)

$$Q = Q_1 + Q_2,$$

where $Q_2(0) = \dot{Q}_2(0) = 0$ and $Q_1 = Q_1(0) + gt^2/2$. For $Q_2$, we then get

$$\ddot{Q}_2 = 2[gt, \Omega] + O(\Omega^2), \qquad Q_2 \approx \frac{t^3}{3}[g, \Omega] \approx \frac{2t}{3}[h, \Omega], \qquad h = \frac{gt^2}{2}.$$


Figure 109 Displacement of a falling stone by Coriolis force

From this it is apparent that the stone lands about

$$\frac{2t}{3}|h||\Omega| \cos \lambda \approx \frac{2 \cdot 7}{3} \cdot 250 \cdot 7 \cdot 10^{-5} \cdot \frac{1}{2}\ \text{m} \approx 4\ \text{cm}$$

to the east.
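
The arithmetic of this estimate can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: the latitude of Leningrad is taken as about 60°, so that $\cos \lambda \approx 1/2$ as in the estimate, $g = 9.8\ \text{m/s}^2$, and the earth's angular velocity is taken as $2\pi$ per 24 hours; the function name is hypothetical.

```python
import math

OMEGA_EARTH = 2 * math.pi / (24 * 3600)  # rad/s, about 7.3e-5

def eastward_deflection(depth, latitude_deg, g=9.8, Omega=OMEGA_EARTH):
    """Leading-order deflection (2t/3) * h * |Omega| * cos(latitude), in metres."""
    t = math.sqrt(2 * depth / g)  # time of fall from rest through the depth h
    return (2 * t / 3) * depth * Omega * math.cos(math.radians(latitude_deg))

if __name__ == "__main__":
    # 250 m shaft at an assumed latitude of 60 degrees: a few centimetres.
    print(eastward_deflection(250.0, 60.0))
```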

Problem.  By  how  much  would  the  Coriolis  force  displace  a  missile  fired  vertically  upwards  at 
Leningrad  from  falling  back  onto  its  launching  pad,  if  the  missile  rose  1  kilometer? 

Example 2 (The Foucault pendulum). Consider small oscillations of an ideal pendulum, taking
into account the Coriolis force. Let $e_x$, $e_y$, and $e_z$ be the axes of a coordinate system associated
to the earth, with $e_z$ directed upwards, and $e_x$ and $e_y$ in the horizontal plane (Figure 110).


Figure 110 Coordinate system for studying the motion of a Foucault pendulum

In the approximation of small oscillations, $\dot{z} = 0$ (in comparison with $\dot{x}$ and $\dot{y}$); therefore, the
horizontal component of the Coriolis force will be $2m\dot{y}\Omega_z e_x - 2m\dot{x}\Omega_z e_y$. From this we get the
equations of motion

$$\begin{cases} \ddot{x} = -\omega^2 x + 2\dot{y}\Omega_z, \\ \ddot{y} = -\omega^2 y - 2\dot{x}\Omega_z \end{cases} \qquad (\Omega_z = |\Omega| \sin \lambda_0, \text{ where } \lambda_0 \text{ is the latitude}).$$

If we set $x + iy = w$, then $\dot{w} = \dot{x} + i\dot{y}$, $\ddot{w} = \ddot{x} + i\ddot{y}$, and the two equations reduce to one
complex equation

$$\ddot{w} + i2\Omega_z\dot{w} + \omega^2 w = 0.$$

We solve it: $w = e^{\lambda t}$, $\lambda^2 + 2i\Omega_z\lambda + \omega^2 = 0$, $\lambda = -i\Omega_z \pm i\sqrt{\Omega_z^2 + \omega^2}$. But $\Omega_z^2 \ll \omega^2$. Therefore,
$\sqrt{\Omega_z^2 + \omega^2} = \omega + O(\Omega_z^2)$, from which it follows, by disregarding $\Omega_z^2$, that

$$\lambda \approx -i\Omega_z \pm i\omega$$

or, to the same accuracy,

$$w = e^{-i\Omega_z t}(c_1 e^{i\omega t} + c_2 e^{-i\omega t}).$$

For $\Omega_z = 0$ we get the usual harmonic oscillations of a spherical pendulum. We see that the
effect of the Coriolis force reduces to a rotation of the whole picture with angular velocity $-\Omega_z$,
where $|\Omega_z| = |\Omega| \sin \lambda_0$.

In particular, if the initial conditions correspond to a planar motion ($y(0) = \dot{y}(0) = 0$), then
the plane of oscillation will be rotating with angular velocity $-\Omega_z$ with respect to the earth's
coordinate system (Figure 111).

At a pole, the plane of oscillation makes one turn in a twenty-four-hour day (and is fixed
with respect to a coordinate system not rotating with the earth). At the latitude of Moscow (56°)
the plane of oscillation turns 0.83 of a rotation in a twenty-four-hour day, i.e., 12.5° in an hour.
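
The figures quoted for Moscow follow from the relation $|\Omega_z| = |\Omega| \sin \lambda_0$. A minimal sketch of the arithmetic, with hypothetical function names and a twenty-four-hour day taken as the period of the earth's rotation, as in the text:

```python
import math

def turns_per_day(latitude_deg):
    """Fraction of a full turn made by the plane of oscillation in 24 hours."""
    return math.sin(math.radians(latitude_deg))

def degrees_per_hour(latitude_deg):
    """Rate of turning of the plane of oscillation, in degrees per hour."""
    return 360.0 * turns_per_day(latitude_deg) / 24.0

if __name__ == "__main__":
    for lat in (90.0, 56.0, 0.0):
        print(lat, turns_per_day(lat), degrees_per_hour(lat))
```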
Figure 111 Trajectory of a Foucault pendulum

Problem.  A  river  flows  with  velocity  3  km/hr.  For  what  radius  of  curvature  of  a  river  bend  is  the 
Coriolis  force  from  the  earth’s  rotation  greater  than  the  centrifugal  force  determined  by  the  flow 
of  the  river? 

Answer. The radius of curvature must be at least on the order of 10 km for a river of medium
width.

The solution of this problem explains why a large river in the northern hemisphere (for example, the Volga in the middle of its course) undermines the base of its right bank, while a river like the Moscow River, with its abrupt bends of small radius, undermines either the left or the right bank (whichever is outward from the bend).


28  Rigid  bodies 

In  this  paragraph  we  define  a  rigid  body  and  its  inertia  tensor,  inertia  ellipsoid,  moments  of 
inertia,  and  axes  of  inertia. 

A  The  configuration  manifold  of  a  rigid  body 

Definition.  A  rigid  body  is  a  system  of  point  masses,  constrained  by  holonomic 
relations  expressed  by  the  fact  that  the  distance  between  points  is  constant : 

(1)  $|x_i - x_j| = r_{ij} = \text{const}.$

Theorem. The configuration manifold of a rigid body is a six-dimensional manifold, namely, $\mathbb{R}^3 \times SO(3)$ (the direct product of a three-dimensional space $\mathbb{R}^3$ and the group $SO(3)$ of its rotations), as long as there are three points in the body not in a straight line.

Proof. Let $x_1$, $x_2$, and $x_3$ be three points of the body which do not lie in a straight line. Consider the right-handed orthonormal frame whose first vector is in the direction of $x_2 - x_1$, and whose second is on the $x_3$ side in the $x_1x_2x_3$-plane (Figure 112). It follows from the conditions $|x_i - x_j| = r_{ij}$ ($i = 1, 2, 3$) that the positions of all the points of the body are uniquely determined by the positions of $x_1$, $x_2$, and $x_3$, which are given by the position of the frame. Finally, the space of frames in $\mathbb{R}^3$ is $\mathbb{R}^3 \times SO(3)$, since every frame is obtained from a fixed one by a rotation and a translation.48  □

48  Strictly speaking, the configuration space of a rigid body is $\mathbb{R}^3 \times O(3)$, and $\mathbb{R}^3 \times SO(3)$ is only one of the two connected components of this manifold, corresponding to the orientation of the body.

Figure 112  Configuration manifold of a rigid body


Problem.  Find  the  configuration  space  of  a  rigid  body,  all  of  whose  points  lie  on  a  line. 


Answer. $\mathbb{R}^3 \times S^2$.

Definition. A rigid body with a fixed point O is a system of point masses constrained by the condition $x_1 = O$ in addition to conditions (1).

Clearly, its configuration manifold is the three-dimensional rotation group $SO(3)$.

B  Conservation  laws 

Consider  the  problem  of  the  motion  of  a  free  rigid  body  under  its  own  inertia, 
outside  of  any  force  field.  For  an  (approximate)  example  we  can  use  the 
rolling  of  a  spaceship. 

The  system  admits  all  translational  displacements:  they  do  not  change 
the  lagrangian  function.  By  Noether’s  theorem  there  exist  three  first  integrals : 
the  three  components  of  the  vector  of  momentum.  Therefore,  we  have  shown 

Theorem.  Under  the  free  motion  of  a  rigid  body,  its  center  of  mass  moves 
uniformly  and  linearly. 

Now  we  can  look  at  an  inertial  coordinate  system  in  which  the  center  of 
inertia  is  stationary.  Then  we  have 

Corollary. A free rigid body rotates about its center of mass as if the center of mass were fixed at a stationary point O.

In this way, the problem is reduced to the problem, with three degrees of freedom, of the motion of a rigid body around a fixed point O. We will study this problem in more detail (not necessarily assuming that O is the center of mass of the body).

The  lagrangian  function  admits  all  rotations  around  O.  By  Noether’s 
theorem  there  exist  three  corresponding  first  integrals:  the  three  components 
of the vector of angular momentum. The total energy of the system, E = T, is also conserved (here it is equal to the kinetic energy). Therefore, we have shown

Theorem. In the problem of the motion of a rigid body around a stationary point O, in the absence of outside forces, there are four first integrals: $M_x$, $M_y$, $M_z$, and $E$.

From  this  theorem  we  can  get  qualitative  conclusions  about  the  motion 
without  any  calculation. 

The position and velocity of the body are determined by a point in the six-dimensional manifold $TSO(3)$, the tangent bundle of the configuration manifold $SO(3)$. The first integrals $M_x$, $M_y$, $M_z$, and $E$ are four functions on $TSO(3)$. One can verify that in the general case (if the body does not have any particular symmetry) these four functions are independent. Therefore, the four equations

$M_x = C_1, \quad M_y = C_2, \quad M_z = C_3, \quad E = C_4 > 0$

define a two-dimensional submanifold $V_c$ in the six-dimensional manifold $TSO(3)$.

This manifold is invariant: if the initial conditions of motion give a point on $V_c$, then for all time of the motion, the point in $TSO(3)$ corresponding to the position and velocity of the body remains in $V_c$.

Therefore, $V_c$ admits a tangent vector field (namely, the field of velocities of the motion on $TSO(3)$); for $C_4 > 0$ this field cannot have singular points. Furthermore, it is easy to verify that $V_c$ is compact (using $E$) and orientable (since $TSO(3)$ is orientable).49

In topology it is proved that the only connected orientable compact two-dimensional manifolds are the spheres with $n$ handles, $n \ge 0$ (Figure 113). Of these, only the torus ($n = 1$) admits a tangent vector field without singular points. Therefore, the invariant manifold $V_c$ is a two-dimensional torus (or several tori).

We will see later that one can choose angular coordinates $\varphi_1$, $\varphi_2$ (mod $2\pi$) on this torus such that a motion represented by a point of $V_c$ is given by the equations $\dot\varphi_1 = \omega_1(c)$, $\dot\varphi_2 = \omega_2(c)$.

49  The following assertions are easy to prove:

1. Let $f_1, \ldots, f_k: M \to \mathbb{R}$ be functions on an oriented manifold $M$. Consider the set $V$ given by the equations $f_1 = c_1, \ldots, f_k = c_k$. Assume that the gradients of $f_1, \ldots, f_k$ are linearly independent at each point. Then $V$ is orientable.

2. The direct product of orientable manifolds is orientable.

3. The tangent bundle $TSO(3)$ is the direct product $\mathbb{R}^3 \times SO(3)$. A manifold whose tangent bundle is a direct product is called parallelizable. The group $SO(3)$ (like every Lie group) is parallelizable.

4. A parallelizable manifold is orientable.

It follows from assertions 1-4 that $SO(3)$, $TSO(3)$, and $V_c$ are orientable.

Figure 113  Two-dimensional compact connected orientable manifolds


In other words, a rotation of a rigid body is represented by the superposition of two periodic motions with (usually) different periods: if the frequencies $\omega_1$ and $\omega_2$ are non-commensurable, then the body never returns to its original state of motion. The magnitudes of the frequencies $\omega_1$ and $\omega_2$ depend on the initial conditions $C$.

C  The  inertia  operator50 

We now go on to the quantitative theory and introduce the following notation. Let k be a stationary coordinate system and K a coordinate system rotating together with the body around the point O: in K the body is at rest.


Figure 114  Radius vector and vectors of velocity, angular velocity and angular momentum of a point of the body in space

Every vector in K is carried over to k by an operator B. Corresponding vectors in K and k will be denoted by the same letter; capital for K and lower case for k. So, for example (Figure 114),

$\mathbf{q} \in k$ is the radius vector of a point in space;
$\mathbf{Q} \in K$ is its radius vector in the body, $\mathbf{q} = B\mathbf{Q}$;
$\mathbf{v} = \dot{\mathbf{q}} \in k$ is the velocity vector of a point in space;
$\mathbf{V} \in K$ is the same vector in the body, $\mathbf{v} = B\mathbf{V}$;
$\boldsymbol{\omega} \in k$ is the angular velocity in space;
$\boldsymbol{\Omega} \in K$ is the angular velocity in the body, $\boldsymbol{\omega} = B\boldsymbol{\Omega}$;
$\mathbf{m} \in k$ is the angular momentum in space;
$\mathbf{M} \in K$ is the angular momentum in the body, $\mathbf{m} = B\mathbf{M}$.

Since the operator $B: K \to k$ preserves the metric and orientation, it preserves the scalar and vector products.

50  Often called the inertia tensor (translator’s note).

By definition of angular velocity (Section 26),

$\mathbf{v} = [\boldsymbol{\omega}, \mathbf{q}].$

By definition of the angular momentum of a point of mass $m$ with respect to O,

$\mathbf{m} = [\mathbf{q}, m\mathbf{v}] = m[\mathbf{q}, [\boldsymbol{\omega}, \mathbf{q}]].$

Therefore,

$\mathbf{M} = m[\mathbf{Q}, [\boldsymbol{\Omega}, \mathbf{Q}]].$

Hence, there is a linear operator transforming $\boldsymbol{\Omega}$ to $\mathbf{M}$:

$A: K \to K, \qquad A\boldsymbol{\Omega} = \mathbf{M}.$

This  operator  still  depends  on  a  point  of  the  body  (Q)  and  its  mass  (m). 


Lemma.  The  operator  A  is  symmetric. 

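Proof. In view of the relation $([\mathbf{a}, \mathbf{b}], \mathbf{c}) = ([\mathbf{c}, \mathbf{a}], \mathbf{b})$ we have, for any $\mathbf{X}$ and $\mathbf{Y}$ in K,

$(A\mathbf{X}, \mathbf{Y}) = m([\mathbf{Q}, [\mathbf{X}, \mathbf{Q}]], \mathbf{Y}) = m([\mathbf{Y}, \mathbf{Q}], [\mathbf{X}, \mathbf{Q}]),$

and the last expression is symmetric in $\mathbf{X}$ and $\mathbf{Y}$.  □

By substituting the vector of angular velocity $\boldsymbol{\Omega}$ for $\mathbf{X}$ and $\mathbf{Y}$ and noticing that $[\boldsymbol{\Omega}, \mathbf{Q}]^2 = \mathbf{V}^2 = \mathbf{v}^2$, we obtain

Corollary. The kinetic energy of a point of a body is a quadratic form with respect to the vector of angular velocity $\boldsymbol{\Omega}$, namely:

$T = \frac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = \frac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}).$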

The  symmetric  operator  A  is  called  the  inertia  operator  (or  tensor)  of  the 
point  Q. 

If a body consists of many points $\mathbf{Q}_i$ with masses $m_i$, then by summing we obtain

Theorem. The angular momentum $\mathbf{M}$ of a rigid body with respect to a stationary point O depends linearly on the angular velocity $\boldsymbol{\Omega}$, i.e., there exists a linear operator $A: K \to K$, $A\boldsymbol{\Omega} = \mathbf{M}$. The operator $A$ is symmetric.

The kinetic energy of a body is a quadratic form with respect to the angular velocity $\boldsymbol{\Omega}$,

$T = \frac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = \frac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}).$

Proof. By definition, the angular momentum of a body is equal to the sum of the angular momenta of its points:

$\mathbf{M} = \sum_i \mathbf{M}_i = \sum_i A_i\boldsymbol{\Omega} = A\boldsymbol{\Omega}$, where $A = \sum_i A_i$.

Since by the lemma the inertia operator $A_i$ of every point is symmetric, the operator $A$ is also symmetric. For kinetic energy we obtain, by definition,

$T = \sum_i T_i = \sum_i \frac{1}{2}(\mathbf{M}_i, \boldsymbol{\Omega}) = \frac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}) = \frac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}).$  □
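
The theorem can be checked numerically on a small example. The following is a minimal sketch in plain Python, assuming an arbitrary set of three point masses and an arbitrary angular velocity; the numbers and the helper names `inertia_operator` and `apply_operator` are illustrative and not part of the text. It uses the identity $[\mathbf{Q}, [\mathbf{X}, \mathbf{Q}]] = |\mathbf{Q}|^2\mathbf{X} - (\mathbf{Q}, \mathbf{X})\mathbf{Q}$ to write $A$ as a matrix, and compares it with the direct sums over the points.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_operator(points):
    """Matrix of A = sum_i m_i (|Q_i|^2 Id - Q_i Q_i^T) for (mass, position) pairs."""
    A = [[0.0] * 3 for _ in range(3)]
    for m, Q in points:
        q2 = dot(Q, Q)
        for r in range(3):
            for c in range(3):
                A[r][c] += m * ((q2 if r == c else 0.0) - Q[r] * Q[c])
    return A

def apply_operator(A, X):
    return tuple(dot(row, X) for row in A)

# Hypothetical body: three point masses given in the body frame K.
points = [(2.0, (1.0, 0.0, 0.5)), (1.0, (0.0, 2.0, -1.0)), (3.0, (-1.0, 1.0, 1.0))]
Omega = (0.3, -0.2, 0.5)

A = inertia_operator(points)
M_operator = apply_operator(A, Omega)
T_operator = 0.5 * dot(M_operator, Omega)

# Direct sums: M = sum m_i [Q_i, [Omega, Q_i]],  T = sum (1/2) m_i |[Omega, Q_i]|^2.
M_direct = [0.0, 0.0, 0.0]
T_direct = 0.0
for m, Q in points:
    V = cross(Omega, Q)
    M_point = cross(Q, V)
    for k in range(3):
        M_direct[k] += m * M_point[k]
    T_direct += 0.5 * m * dot(V, V)

print(M_operator, M_direct)
print(T_operator, T_direct)
```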

D  Principal  axes 

Like every symmetric operator, $A$ has three mutually orthogonal characteristic directions. Let $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3 \in K$ be their unit vectors and $I_1$, $I_2$, and $I_3$ their eigenvalues. In the basis $\mathbf{e}_i$, the inertia operator and the kinetic energy have a particularly simple form:

$M_i = I_i\Omega_i$

$T = \frac{1}{2}(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2).$

The axes $\mathbf{e}_i$ are called the principal axes of the body at the point O.

Finally, if the numbers $I_1$, $I_2$, and $I_3$ are not all different, then the axes $\mathbf{e}_i$ are not uniquely defined. We will further clarify the meaning of the eigenvalues $I_1$, $I_2$, and $I_3$.

Theorem. For a rotation of a rigid body fixed at a point O, with angular velocity $\boldsymbol{\Omega} = \Omega\mathbf{e}$ ($\Omega = |\boldsymbol{\Omega}|$) around the $\mathbf{e}$ axis, the kinetic energy is equal to

$T = \frac{1}{2} I_e \Omega^2$, where $I_e = \sum_i m_i r_i^2$

and $r_i$ is the distance of the $i$-th point to the $\mathbf{e}$ axis (Figure 115).

Figure 115  Kinetic energy of a body rotating around an axis

Proof. By definition $T = \frac{1}{2}\sum_i m_i v_i^2$; but $|v_i| = \Omega r_i$, so $T = \frac{1}{2}\left(\sum_i m_i r_i^2\right)\Omega^2$.  □

The number $I_e$ depends on the direction $\mathbf{e}$ of the axis of rotation $\boldsymbol{\Omega}$ in the body.

Definition. $I_e$ is called the moment of inertia of the body with respect to the $\mathbf{e}$ axis:

$I_e = \sum_i m_i r_i^2.$

By comparing the two expressions for $T$ we obtain:

Corollary. The eigenvalues $I_i$ of the inertia operator $A$ are the moments of inertia of the body with respect to the principal axes $\mathbf{e}_i$.

E  The  inertia  ellipsoid 

In order to study the dependence of the moment of inertia $I_e$ upon the direction of the axis $\mathbf{e}$ in a body, we consider the vectors $\mathbf{e}/\sqrt{I_e}$, where the unit vector $\mathbf{e}$ runs over the unit sphere.

Theorem. The vectors $\mathbf{e}/\sqrt{I_e}$ form an ellipsoid in K.

Proof. If $\boldsymbol{\Omega} = \mathbf{e}/\sqrt{I_e}$, then the quadratic form $T = \frac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega})$ is equal to $\frac{1}{2}$. Therefore, $\{\boldsymbol{\Omega}\}$ is the level set of a positive definite quadratic form, i.e., an ellipsoid.  □

One could say that this ellipsoid consists of those angular velocity vectors $\boldsymbol{\Omega}$ whose kinetic energy is equal to $\frac{1}{2}$.

Definition. The ellipsoid $\{\boldsymbol{\Omega} : (A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 1\}$ is called the inertia ellipsoid of the body at the point O (Figure 116).
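
As a numerical illustration of the theorem, the sketch below computes $I_e$ from the distances to the axis for several directions $\mathbf{e}$ and checks that $\mathbf{e}/\sqrt{I_e}$ satisfies $(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 1$. The point masses, the sample directions, and the helper names are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_operator(points):
    """Matrix of A = sum_i m_i (|Q_i|^2 Id - Q_i Q_i^T) for (mass, position) pairs."""
    A = [[0.0] * 3 for _ in range(3)]
    for m, Q in points:
        q2 = dot(Q, Q)
        for r in range(3):
            for c in range(3):
                A[r][c] += m * ((q2 if r == c else 0.0) - Q[r] * Q[c])
    return A

def moment_about_axis(points, e):
    """I_e = sum_i m_i r_i^2, with r_i the distance from Q_i to the axis along unit vector e."""
    return sum(m * (dot(Q, Q) - dot(Q, e) ** 2) for m, Q in points)

# Hypothetical body: three point masses given in the body frame K.
points = [(2.0, (1.0, 0.0, 0.5)), (1.0, (0.0, 2.0, -1.0)), (3.0, (-1.0, 1.0, 1.0))]
A = inertia_operator(points)

form_values = []
for d in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (1.0, -2.0, 3.0)]:
    norm = math.sqrt(dot(d, d))
    e = tuple(x / norm for x in d)                 # unit vector along the axis
    I_e = moment_about_axis(points, e)
    Omega = tuple(x / math.sqrt(I_e) for x in e)   # candidate point of the ellipsoid
    A_Omega = tuple(dot(row, Omega) for row in A)
    form_values.append(dot(A_Omega, Omega))        # should equal 1 up to rounding

print(form_values)
```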


Figure  116  Ellipsoid  of  inertia 


In terms of the principal axes $\mathbf{e}_i$, the equation of the inertia ellipsoid has the form

$I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2 = 1.$

Therefore the principal axes of the inertia ellipsoid are directed along the principal axes of the inertia tensor, and their lengths are inversely proportional to $\sqrt{I_i}$.

Remark. If a body is stretched out along some axis, then the moment of inertia with respect to this axis is small, and consequently, the inertia ellipsoid is also stretched out along this axis; thus, the inertia ellipsoid may resemble the shape of the body.

If a body has an axis of symmetry of order $k$ passing through O (so that it coincides with itself after rotation by $2\pi/k$ around the axis), then the inertia ellipsoid also has the same symmetry with respect to this axis. But a triaxial ellipsoid does not have axes of symmetry of order $k > 2$. Therefore, every axis of symmetry of a body of order $k > 2$ is an axis of rotation of the inertia ellipsoid and, therefore, a principal axis.

Example. The inertia ellipsoid of three points of mass $m$ at the vertices of an equilateral triangle with center O is an ellipsoid of revolution around an axis normal to the plane of the triangle (Figure 117).
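
The example can be verified directly. The sketch below places three unit masses at the vertices of an equilateral triangle in the plane $z = 0$ (circumradius 1 and $m = 1$ are assumptions made for illustration) and builds the matrix of the inertia operator; two equal eigenvalues in the plane of the triangle indicate an ellipsoid of revolution about the normal.

```python
import math

def inertia_operator(points):
    """Matrix of A = sum_i m_i (|Q_i|^2 Id - Q_i Q_i^T) for (mass, position) pairs."""
    A = [[0.0] * 3 for _ in range(3)]
    for m, Q in points:
        q2 = sum(x * x for x in Q)
        for r in range(3):
            for c in range(3):
                A[r][c] += m * ((q2 if r == c else 0.0) - Q[r] * Q[c])
    return A

m = 1.0
angles = [math.radians(a) for a in (90.0, 210.0, 330.0)]
triangle = [(m, (math.cos(t), math.sin(t), 0.0)) for t in angles]

A = inertia_operator(triangle)
for row in A:
    print([round(x, 12) for x in row])
```

In this sketch the matrix comes out diagonal with equal entries for the two in-plane axes, so any axis in the plane of the triangle through O is principal.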


Figure 117  Ellipsoid of inertia of an equilateral triangle

If  there  are  several  such  axes,  then  the  inertia  ellipsoid  is  a  sphere,  and  any 
axis  is  principal. 

Problem.  Draw  the  line  through  the  center  of  a  cube  such  that  the  sum  of  the  squares  of  its 
distances  from  the  vertices  of  the  cube  is:  (a)  largest,  (b)  smallest. 
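
One way to experiment with this problem is to note that, for unit masses at the vertices, the sum of squared distances to a line through the center is the moment of inertia $I_e$ about that line. The sketch below evaluates this sum for a few directions; the cube with vertices $(\pm 1, \pm 1, \pm 1)$ and the sample directions are assumptions made for illustration.

```python
import itertools
import math

cube = list(itertools.product((-1.0, 1.0), repeat=3))  # the eight vertices

def sum_squared_distances(vertices, direction):
    """Sum over vertices of the squared distance to the line through O along `direction`."""
    norm = math.sqrt(sum(x * x for x in direction))
    e = [x / norm for x in direction]
    total = 0.0
    for Q in vertices:
        along = sum(q * x for q, x in zip(Q, e))   # component of Q along the line
        total += sum(q * q for q in Q) - along * along
    return total

directions = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0), (2.0, -1.0, 3.0)]
sums = [sum_squared_distances(cube, d) for d in directions]
print(sums)
```

Comparing the printed values for different directions indicates how the answer relates to the preceding remark about bodies with several axes of symmetry.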

We now remark that the inertia ellipsoid (or the inertia operator or the moments of inertia $I_1$, $I_2$, and $I_3$) completely determines the rotational characteristics of our body: if we consider two bodies with identical inertia ellipsoids, then for identical initial conditions they will move identically (since they have the same lagrangian function $L = T$).

Therefore, from the point of view of the dynamics of rotation around O, the space of all rigid bodies is three-dimensional, however many points compose the body.

We can even consider the “solid rigid body of density $\rho(Q)$,” having in mind the limit as $\Delta Q \to 0$ of the sequence of bodies with a finite number of points $Q_i$ with masses $\rho(Q_i)\Delta Q_i$ (Figure 118) or, what amounts to the same thing, any body with moments of inertia

$I_e = \iiint \rho(Q)\, r^2(Q)\, dQ,$

where $r$ is the distance from $Q$ to the $\mathbf{e}$ axis.

Example. Find the principal axes and moments of inertia of the uniform planar plate $|x| \le a$, $|y| \le b$, $z = 0$ with respect to O.

Solution. Since the plate has three planes of symmetry, the inertia ellipsoid has the same planes of symmetry and, therefore, principal axes $x$, $y$, and $z$. Furthermore,

$I_y = \int_{-a}^{a}\int_{-b}^{b} x^2 \rho \, dy\, dx = \frac{ma^2}{3}.$

In the same way

$I_x = \frac{mb^2}{3}.$

Clearly, $I_z = I_x + I_y$.
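
The closed forms for the plate can be compared with a direct numerical integration. The following sketch uses a midpoint Riemann sum over the rectangle; the values of $a$, $b$, and $\rho$ and the grid size are arbitrary choices made for illustration.

```python
def plate_moments(a, b, rho, n=200):
    """Midpoint-rule estimates of the mass and of I_x, I_y, I_z for the plate |x|<=a, |y|<=b."""
    hx, hy = 2.0 * a / n, 2.0 * b / n
    mass = Ix = Iy = Iz = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * hx
        for j in range(n):
            y = -b + (j + 0.5) * hy
            dm = rho * hx * hy
            mass += dm
            Ix += y * y * dm            # squared distance to the x axis
            Iy += x * x * dm            # squared distance to the y axis
            Iz += (x * x + y * y) * dm  # squared distance to the z axis
    return mass, Ix, Iy, Iz

a, b, rho = 2.0, 1.0, 1.5
mass, Ix, Iy, Iz = plate_moments(a, b, rho)

print(Ix, mass * b * b / 3.0)
print(Iy, mass * a * a / 3.0)
print(Iz, Ix + Iy)
```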

Problem. Show that the moments of inertia of any body satisfy the triangle inequalities

$I_3 \le I_2 + I_1, \quad I_2 \le I_1 + I_3, \quad \text{and} \quad I_1 \le I_2 + I_3,$

and that equality holds only for a planar body.

Problem. Find the axes and moments of inertia of a homogeneous ellipsoid of mass $m$ with semiaxes $a$, $b$, and $c$ relative to the center O.

Hint.  First  look  at  the  sphere. 

Problem.  Prove  Steiner’s  theorem:  The  moments  of  inertia  of  any  rigid  body 
relative  to  two  parallel  axes,  one  of  which  passes  through  the  center  of  mass, 
are  related  by  the  equation 

$I = I_0 + mr^2,$

where $m$ is the mass of the body, $r$ is the distance between the axes, and $I_0$ is the moment of inertia relative to the axis passing through the center of mass.

Thus  the  moment  of  inertia  relative  to  an  axis  passing  through  the  center 
of  mass  is  less  than  the  moment  of  inertia  relative  to  any  parallel  axis. 
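
Steiner's relation is easy to test on a discrete body. The sketch below computes the moment of inertia about two parallel lines, one through the center of mass, and compares the difference with $mr^2$. The point masses, the direction of the axes, and the point `P` on the second axis are assumptions chosen for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def moment_about_line(points, P, e):
    """Sum of m_i times the squared distance from Q_i to the line through P along unit vector e."""
    total = 0.0
    for m, Q in points:
        d = tuple(q - p for q, p in zip(Q, P))
        total += m * (dot(d, d) - dot(d, e) ** 2)
    return total

# Hypothetical body made of three point masses.
points = [(2.0, (1.0, 0.0, 0.5)), (1.0, (0.0, 2.0, -1.0)), (3.0, (-1.0, 1.0, 1.0))]
mass = sum(m for m, _ in points)
center = tuple(sum(m * Q[k] for m, Q in points) / mass for k in range(3))

e = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit direction shared by both axes
P = (1.0, -2.0, 0.5)                   # a point on the second, parallel axis

offset = tuple(p - c for p, c in zip(P, center))
r_squared = dot(offset, offset) - dot(offset, e) ** 2  # squared distance between the axes

I_0 = moment_about_line(points, center, e)
I_parallel = moment_about_line(points, P, e)

print(I_parallel, I_0 + mass * r_squared)
```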

Problem.  Find  the  principal  axes  and  moments  of  inertia  of  a  uniform  tetrahedron  relative  to 
its  vertices. 

Problem. Draw the angular momentum vector $\mathbf{M}$ for a body with a given inertia ellipsoid rotating with a given angular velocity $\boldsymbol{\Omega}$.

Answer. $\mathbf{M}$ is in the direction normal to the inertia ellipsoid at a point on the $\boldsymbol{\Omega}$ axis (Figure 119).

Figure 119  Angular velocity, ellipsoid of inertia and angular momentum

Figure 120  Behavior of moments of inertia as the body becomes smaller

Problem.  A  piece  is  cut  off  a  rigid  body  fixed  at  the  stationary  point  O.  How  are  the  principal 
moments  of  inertia  changed?  (Figure  120). 

Answer.  All  three  principal  moments  are  decreased. 

Hint.  Cf.  Section  24. 

Problem. A small mass $\varepsilon$ is added to a rigid body with moments of inertia $I_1 > I_2 > I_3$ at the point $\mathbf{Q} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3$. Find the change in $I_1$ and $\mathbf{e}_1$ with error $O(\varepsilon^2)$.

Solution. The center of mass is displaced by a distance of order $\varepsilon$. Therefore, the moments of inertia of the old body with respect to the parallel axes passing through the old and new centers of mass differ in magnitude of order $\varepsilon^2$. At the same time, the addition of mass changes the moment of inertia relative to any fixed axis by order $\varepsilon$. Therefore, we can disregard the displacement of the center of mass for calculations with error $O(\varepsilon^2)$.

Thus, after addition of a small mass the kinetic energy takes the form

$T = T_0 + \frac{1}{2}\varepsilon[\boldsymbol{\Omega}, \mathbf{Q}]^2 + O(\varepsilon^2),$

where $T_0 = \frac{1}{2}(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2)$ is the kinetic energy of the original body. We look for the eigenvalue $I_1(\varepsilon)$ and eigenvector $\mathbf{e}_1(\varepsilon)$ of the inertia operator in the form of a Taylor series in $\varepsilon$. By equating coefficients of $\varepsilon$ in the relation $A(\varepsilon)\mathbf{e}_1(\varepsilon) = I_1(\varepsilon)\mathbf{e}_1(\varepsilon)$, we find that, within error $O(\varepsilon^2)$:

$I_1(\varepsilon) \approx I_1 + \varepsilon(x_2^2 + x_3^2) \quad \text{and} \quad \mathbf{e}_1(\varepsilon) \approx \mathbf{e}_1 + \varepsilon\left(\frac{x_1x_2}{I_2 - I_1}\mathbf{e}_2 + \frac{x_1x_3}{I_3 - I_1}\mathbf{e}_3\right).$

From the formula for $I_1(\varepsilon)$ it is clear that the change in the principal moments of inertia (to the first approximation in $\varepsilon$) is as if neither the center of mass nor the principal axes changed. The formula for $\mathbf{e}_1(\varepsilon)$ demonstrates how the directions of the principal axes change: the largest principal axis of the inertia ellipsoid approaches the added point, and the smallest recedes from it. Furthermore, the addition of a small mass on one of the principal planes of the inertia ellipsoid rotates the two axes lying in this plane and does not change the direction of the third axis. The appearance of the differences of moments of inertia in the denominator is connected with the fact that the major axes of an ellipsoid of revolution are not defined. If the inertia ellipsoid is nearly an ellipsoid of revolution (i.e., $I_1 \approx I_2$), then the addition of a small mass could strongly turn the axes $\mathbf{e}_1$ and $\mathbf{e}_2$ in the plane spanned by them.
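
The first-order formulas can be tested without computing eigenvectors: if they are correct to $O(\varepsilon^2)$, the residual $|A(\varepsilon)\mathbf{e}_1(\varepsilon) - I_1(\varepsilon)\mathbf{e}_1(\varepsilon)|$ should shrink by a factor of about 4 when $\varepsilon$ is halved, whereas the unperturbed pair $(I_1, \mathbf{e}_1)$ gives a factor of about 2. The sketch below assumes, as in the solution, that the displacement of the center of mass is disregarded, so that $A(\varepsilon) = A + \varepsilon(|\mathbf{Q}|^2 - \mathbf{Q}\mathbf{Q}^T)$; the numerical values are arbitrary.

```python
import math

I1, I2, I3 = 3.0, 2.0, 1.0    # hypothetical principal moments, I1 > I2 > I3
x1, x2, x3 = 0.5, -0.4, 0.7   # hypothetical coordinates of the added point Q

def perturbed_operator(eps):
    """A(eps) = diag(I1, I2, I3) + eps * (|Q|^2 Id - Q Q^T) in the basis e1, e2, e3."""
    Q = (x1, x2, x3)
    q2 = sum(x * x for x in Q)
    diag = (I1, I2, I3)
    return [[(diag[r] if r == c else 0.0) + eps * ((q2 if r == c else 0.0) - Q[r] * Q[c])
             for c in range(3)] for r in range(3)]

def residual(eps, eigenvalue, vector):
    """Norm of A(eps) v - eigenvalue * v."""
    A = perturbed_operator(eps)
    r = [sum(A[i][j] * vector[j] for j in range(3)) - eigenvalue * vector[i] for i in range(3)]
    return math.sqrt(sum(c * c for c in r))

def first_order_residual(eps):
    eigenvalue = I1 + eps * (x2 * x2 + x3 * x3)
    vector = (1.0, eps * x1 * x2 / (I2 - I1), eps * x1 * x3 / (I3 - I1))
    return residual(eps, eigenvalue, vector)

def zeroth_order_residual(eps):
    return residual(eps, I1, (1.0, 0.0, 0.0))

first_order_ratio = first_order_residual(1e-3) / first_order_residual(5e-4)
zeroth_order_ratio = zeroth_order_residual(1e-3) / zeroth_order_residual(5e-4)
print(first_order_ratio, zeroth_order_ratio)
```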

29  Euler’s  equations.  Poinsot’s  description  of  the  motion 

Here  we  study  the  motion  of  a  rigid  body  around  a  stationary  point  in  the  absence  of  outside 
forces  and  the  similar  motion  of  a  free  rigid  body.  The  motion  turns  out  to  have  two  frequencies. 

A  Euler’s equations

Consider the motion of a rigid body around a stationary point O. Let $\mathbf{M}$ be the angular momentum vector of the body relative to O in the body, $\boldsymbol{\Omega}$ the angular velocity vector in the body, and $A$ the inertia operator ($A\boldsymbol{\Omega} = \mathbf{M}$); the vectors $\boldsymbol{\Omega}$ and $\mathbf{M}$ belong to the moving coordinate system K (Section 26). The angular momentum vector of the body relative to O in space, $\mathbf{m} = B\mathbf{M}$, is preserved under the motion (Section 28B).

Therefore, the vector $\mathbf{M}$ in the body ($\mathbf{M} \in K$) must move so that $\mathbf{m} = B_t\mathbf{M}(t)$ does not change when $t$ changes.


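Theorem

(1)  $\frac{d\mathbf{M}}{dt} = [\mathbf{M}, \boldsymbol{\Omega}].$

Proof. We apply formula (5), Section 26 for the velocity of the motion of the “point” $\mathbf{M}(t) \in K$ with respect to the stationary space k. We get

$\dot{\mathbf{m}} = B\dot{\mathbf{M}} + [\boldsymbol{\omega}, \mathbf{m}] = B(\dot{\mathbf{M}} + [\boldsymbol{\Omega}, \mathbf{M}]).$

But since the angular momentum $\mathbf{m}$ with respect to the space is preserved ($\dot{\mathbf{m}} = 0$), $\dot{\mathbf{M}} + [\boldsymbol{\Omega}, \mathbf{M}] = 0$.  □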


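Relation (1) is called the Euler equations. Since $\mathbf{M} = A\boldsymbol{\Omega}$, (1) can be viewed as a differential equation for $\mathbf{M}$ (or for $\boldsymbol{\Omega}$). If

$\boldsymbol{\Omega} = \Omega_1\mathbf{e}_1 + \Omega_2\mathbf{e}_2 + \Omega_3\mathbf{e}_3 \quad \text{and} \quad \mathbf{M} = M_1\mathbf{e}_1 + M_2\mathbf{e}_2 + M_3\mathbf{e}_3$

are the decompositions of $\boldsymbol{\Omega}$ and $\mathbf{M}$ with respect to the principal axes at O, then $M_i = I_i\Omega_i$ and (1) becomes the system of three equations

(2)  $\frac{dM_1}{dt} = a_1M_2M_3, \qquad \frac{dM_2}{dt} = a_2M_3M_1, \qquad \frac{dM_3}{dt} = a_3M_1M_2,$

where $a_1 = (I_2 - I_3)/I_2I_3$, $a_2 = (I_3 - I_1)/I_3I_1$, and $a_3 = (I_1 - I_2)/I_1I_2$, or, in the form of a system of three equations for the three components of the angular velocity,

$I_1\frac{d\Omega_1}{dt} = (I_2 - I_3)\Omega_2\Omega_3,$

$I_2\frac{d\Omega_2}{dt} = (I_3 - I_1)\Omega_3\Omega_1,$

$I_3\frac{d\Omega_3}{dt} = (I_1 - I_2)\Omega_1\Omega_2.$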


Remark. Suppose that outside forces act on the body, the sum of whose moments with respect to O is equal to $\mathbf{n}$ in the stationary coordinate system and $\mathbf{N}$ in the moving system ($\mathbf{n} = B\mathbf{N}$). Then

$\dot{\mathbf{m}} = \mathbf{n}$

and the Euler equations take the form

$\frac{d\mathbf{M}}{dt} = [\mathbf{M}, \boldsymbol{\Omega}] + \mathbf{N}.$

B  Solutions of the Euler equations

Lemma. The Euler equations (2) have two quadratic first integrals

$2E = \frac{M_1^2}{I_1} + \frac{M_2^2}{I_2} + \frac{M_3^2}{I_3} \quad \text{and} \quad M^2 = M_1^2 + M_2^2 + M_3^2.$

Proof. $E$ is preserved by the law of conservation of energy, and $M^2$ by the law of conservation of angular momentum $\mathbf{m}$, since $\mathbf{m}^2 = \mathbf{M}^2 = M^2$.  □
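
The lemma can be observed numerically. The following sketch integrates the system (2) with a classical fourth-order Runge-Kutta step and records how far $2E$ and $M^2$ drift from their initial values. The choice of integrator, step size, moments of inertia, and initial condition are all assumptions made for illustration; a small residual drift due to discretization error is expected.

```python
def euler_rhs(M, I):
    """Right-hand side of system (2): dM1/dt = a1 M2 M3, and cyclic permutations."""
    I1, I2, I3 = I
    a1 = (I2 - I3) / (I2 * I3)
    a2 = (I3 - I1) / (I3 * I1)
    a3 = (I1 - I2) / (I1 * I2)
    return (a1 * M[1] * M[2], a2 * M[2] * M[0], a3 * M[0] * M[1])

def rk4_step(M, I, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, s):
        return tuple(b + s * k for b, k in zip(base, slope))
    k1 = euler_rhs(M, I)
    k2 = euler_rhs(shifted(M, k1, h / 2.0), I)
    k3 = euler_rhs(shifted(M, k2, h / 2.0), I)
    k4 = euler_rhs(shifted(M, k3, h), I)
    return tuple(m + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for m, a, b, c, d in zip(M, k1, k2, k3, k4))

def two_E(M, I):
    return sum(m * m / i for m, i in zip(M, I))

def M_squared(M):
    return sum(m * m for m in M)

I = (3.0, 2.0, 1.0)   # hypothetical moments of inertia, I1 > I2 > I3
M = (1.0, 0.5, 0.2)   # hypothetical initial angular momentum in the body
E_start, M2_start = two_E(M, I), M_squared(M)

drift_E = drift_M2 = 0.0
for _ in range(5000):
    M = rk4_step(M, I, 0.001)
    drift_E = max(drift_E, abs(two_E(M, I) - E_start))
    drift_M2 = max(drift_M2, abs(M_squared(M) - M2_start))

print(drift_E, drift_M2)
```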

Thus, $\mathbf{M}$ lies in the intersection of an ellipsoid and a sphere. In order to study the structure of the curves of intersection we will fix the ellipsoid $E > 0$ and change the radius $M$ of the sphere (Figure 121).

Figure 121  Trajectories of Euler’s equation on an energy level surface

We assume that $I_1 > I_2 > I_3$. The semiaxes of the ellipsoid will be $\sqrt{2EI_1} > \sqrt{2EI_2} > \sqrt{2EI_3}$. If the radius $M$ of the sphere is less than the smallest semiaxis or larger than the largest ($M < \sqrt{2EI_3}$ or $M > \sqrt{2EI_1}$), then the intersection is empty, and no actual motion corresponds to such values of $E$ and $M$. If the radius of the sphere is equal to the smallest semiaxis, then the intersection consists of two points. Increasing the radius, so that $\sqrt{2EI_3} < M < \sqrt{2EI_2}$, we get two curves around the ends of the smallest semiaxis. In exactly the same way, if the radius of the sphere is equal to the largest semiaxis we get its ends, and if it is a little smaller we get two closed curves close to the ends of the largest semiaxis. Finally, if $M = \sqrt{2EI_2}$, the intersection consists of two circles.

Each of the six ends of the semiaxes of the ellipsoid is a separate trajectory of the Euler equations (2): a stationary position of the vector $\mathbf{M}$. It corresponds to a fixed value of the vector of angular velocity directed along one of the principal axes $\mathbf{e}_i$; during such a motion, $\boldsymbol{\Omega}$ remains collinear with $\mathbf{M}$. Therefore, the vector of angular velocity retains its position $\boldsymbol{\omega}$ in space collinear with $\mathbf{m}$: the body simply rotates with fixed angular velocity around the principal axis of inertia $\mathbf{e}_i$, which is stationary in space.

Definition. A motion of a body, under which its angular velocity remains constant ($\boldsymbol{\omega} = \text{const}$, $\boldsymbol{\Omega} = \text{const}$) is called a stationary rotation.

We have proved:

Theorem. A rigid body fixed at a point O admits a stationary rotation around any of the three principal axes $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$.

If, as we assumed, $I_1 > I_2 > I_3$, then the right-hand side of the Euler equations does not become 0 anywhere else, i.e., there are no other stationary rotations.

We will now investigate the stability (in the sense of Liapunov) of solutions to the Euler equations.

Theorem. The stationary solutions $\mathbf{M} = M_1\mathbf{e}_1$ and $\mathbf{M} = M_3\mathbf{e}_3$ of the Euler equations corresponding to the largest and smallest principal axes are stable, while the solution corresponding to the middle axis ($\mathbf{M} = M_2\mathbf{e}_2$) is unstable.

Proof. For a small deviation of the initial condition from $M_1\mathbf{e}_1$ or $M_3\mathbf{e}_3$, the trajectory will be a small closed curve, while for a small deviation from $M_2\mathbf{e}_2$ it will be a large one.  □
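
The contrast between small and large closed curves can be seen in a simulation. The sketch below starts the Euler equations (2) close to each of the three stationary solutions and records the largest distance of $\mathbf{M}(t)$ from the corresponding axis point. The integrator (classical Runge-Kutta), the moments of inertia, the size of the perturbation, and the integration time are assumptions made for illustration.

```python
import math

def euler_rhs(M, I):
    """Right-hand side of system (2)."""
    I1, I2, I3 = I
    a1 = (I2 - I3) / (I2 * I3)
    a2 = (I3 - I1) / (I3 * I1)
    a3 = (I1 - I2) / (I1 * I2)
    return (a1 * M[1] * M[2], a2 * M[2] * M[0], a3 * M[0] * M[1])

def rk4_step(M, I, h):
    def shifted(base, slope, s):
        return tuple(b + s * k for b, k in zip(base, slope))
    k1 = euler_rhs(M, I)
    k2 = euler_rhs(shifted(M, k1, h / 2.0), I)
    k3 = euler_rhs(shifted(M, k2, h / 2.0), I)
    k4 = euler_rhs(shifted(M, k3, h), I)
    return tuple(m + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for m, a, b, c, d in zip(M, k1, k2, k3, k4))

def max_deviation(M_start, M_reference, I, h=0.01, steps=10000):
    """Largest distance |M(t) - M_reference| along the computed trajectory."""
    M = M_start
    worst = 0.0
    for _ in range(steps):
        M = rk4_step(M, I, h)
        worst = max(worst, math.sqrt(sum((a - b) ** 2 for a, b in zip(M, M_reference))))
    return worst

I = (3.0, 2.0, 1.0)   # hypothetical moments of inertia, I1 > I2 > I3
d = 0.01              # size of the initial perturbation

dev_largest = max_deviation((1.0, d, d), (1.0, 0.0, 0.0), I)
dev_middle = max_deviation((d, 1.0, d), (0.0, 1.0, 0.0), I)
dev_smallest = max_deviation((d, d, 1.0), (0.0, 0.0, 1.0), I)

print(dev_largest, dev_middle, dev_smallest)
```

In this sketch the deviation stays comparable to the perturbation near $\mathbf{e}_1$ and $\mathbf{e}_3$, while near $\mathbf{e}_2$ the vector $\mathbf{M}$ travels to the neighborhood of the opposite end of the middle axis.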

Problem.  Are  stationary  rotations  of  the  body  around  the  largest  and  smallest  principal  axes 
Liapunov  stable? 

Answer.  No. 


C  Poinsot’s description of the motion

It is easy to visualize the motion of the angular momentum and angular velocity vectors in a body ($\mathbf{M}$ and $\boldsymbol{\Omega}$): they are periodic if $M \ne \sqrt{2EI_i}$.

In order to see how a body rotates in space, we look at its inertia ellipsoid

$E = \{\boldsymbol{\Omega} : (A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 1\} \subset K,$

where $A: \boldsymbol{\Omega} \to \mathbf{M}$ is the symmetric operator of inertia of the body fixed at O.

At every moment of time the ellipsoid $E$ occupies a position $B_tE$ in the stationary space k.

Theorem (Poinsot). The inertia ellipsoid rolls without slipping along a stationary plane perpendicular to the angular momentum vector $\mathbf{m}$ (Figure 122).

Proof. Consider a plane $\pi$ perpendicular to the momentum vector $\mathbf{m}$ and tangent to the inertia ellipsoid $B_tE$. There are two such planes, and at the point of tangency the normal to the ellipsoid is parallel to $\mathbf{m}$.

Figure 122  Rolling of the ellipsoid of inertia on the invariable plane

But the inertia ellipsoid $E$ has normal $\operatorname{grad}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 2A\boldsymbol{\Omega} = 2\mathbf{M}$ at the point $\boldsymbol{\Omega}$. Therefore, at the points $\pm\boldsymbol{\xi} = \boldsymbol{\omega}/\sqrt{2T}$ of the $\boldsymbol{\omega}$ axis, the normal to $B_tE$ is collinear with $\mathbf{m}$.

So the plane $\pi$ is tangent to $B_tE$ at the points $\pm\boldsymbol{\xi}$ on the instantaneous axis of rotation. But the scalar product of $\boldsymbol{\xi}$ with the stationary vector $\mathbf{m}$ is equal to $\pm(1/\sqrt{2T})(\mathbf{m}, \boldsymbol{\omega}) = \pm\sqrt{2T}$, and is therefore constant. So the distance of the plane $\pi$ from O does not change, i.e., $\pi$ is stationary.

Since the point of tangency lies on the instantaneous axis of rotation, its velocity is equal to zero. This implies that the ellipsoid $B_tE$ rolls without slipping along $\pi$.  □

Translator’s remark: The plane $\pi$ is sometimes called the invariable plane.

Corollary. Under initial conditions close to a stationary rotation around the large (or small) axis of inertia, the angular velocity always remains close to its initial position, not only in the body ($\boldsymbol{\Omega}$) but also in space ($\boldsymbol{\omega}$).

We now consider the trajectory of the point of tangency in the stationary plane $\pi$. When the point of tangency makes an entire revolution on the ellipsoid, the initial conditions are repeated except that the body has turned through some angle $\alpha$ around the $\mathbf{m}$ axis. The second revolution will be exactly like the first; if $\alpha = 2\pi(p/q)$, the motion is completely periodic; if the angle is not commensurable with $2\pi$, the body will never return to its initial state.

In this case the trajectory of the point of tangency is dense in an annulus with center O' in the plane (Figure 123).

Problem. Show that the connected components of the invariant two-dimensional manifold $V_c$ (Section 28B) in the six-dimensional space $TSO(3)$ are tori, and that one can choose coordinates $\varphi_1$ and $\varphi_2$ mod $2\pi$ on them so that $\dot\varphi_1 = \omega_1(c)$ and $\dot\varphi_2 = \omega_2(c)$.

Figure 123  Trajectory of the point of contact on the invariable plane

Hint. Take the phase of the periodic variation of $\mathbf{M}$ as $\varphi_1$.

We now look at the important special case when the inertia ellipsoid is an ellipsoid of revolution:

$I_2 = I_3 \ne I_1.$

In this case the axis of the ellipsoid $B_t\mathbf{e}_1$, the instantaneous axis of rotation $\boldsymbol{\omega}$, and the vector $\mathbf{m}$ always lie in one plane. The angles between them and the length of the vector $\boldsymbol{\omega}$ are preserved; the axes of rotation ($\boldsymbol{\omega}$) and symmetry ($B_t\mathbf{e}_1$) sweep out cones around the angular momentum vector $\mathbf{m}$ with the same angular velocity (Figure 124). This motion around $\mathbf{m}$ is called precession.

Problem. Find the angular velocity of precession.

Answer. Decompose the angular velocity vector $\boldsymbol{\omega}$ into components in the directions of the angular momentum vector $\mathbf{m}$ and the axis of the body $B_t\mathbf{e}_1$. The first component gives the angular velocity of precession, $\omega_{pr} = M/I_2$.

Figure 124  Rolling of an ellipsoid of revolution on the invariable plane

Hint. Represent the motion of the body as the product of a rotation around the axis of momentum and a subsequent rotation around the axis of the body. The sum of the angular velocity vectors of these rotations is equal to the angular velocity vector of the product.
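
The decomposition named in the answer can be written out explicitly. The following is a sketch of the calculation, assuming $I_2 = I_3$ and the relations $M_i = I_i\Omega_i$ in the principal axes; the symbol $\omega_{body}$ for the coefficient along the axis of the body is introduced here for illustration only.

```latex
% In the body, with I_2 = I_3:
\boldsymbol{\Omega}
  = \frac{M_1}{I_1}\mathbf{e}_1 + \frac{M_2}{I_2}\mathbf{e}_2 + \frac{M_3}{I_2}\mathbf{e}_3
  = \frac{1}{I_2}\mathbf{M} + M_1\left(\frac{1}{I_1} - \frac{1}{I_2}\right)\mathbf{e}_1 .

% Carrying this over to space by B_t (with m = B_t M and |m| = M):
\boldsymbol{\omega}
  = \underbrace{\frac{M}{I_2}}_{\omega_{pr}} \, \frac{\mathbf{m}}{M}
  + \underbrace{M_1\left(\frac{1}{I_1} - \frac{1}{I_2}\right)}_{\omega_{body}} \, B_t\mathbf{e}_1 .
```

The coefficient of the unit vector $\mathbf{m}/M$ is $M/I_2$, in agreement with the answer; the remaining term is directed along the axis of the body.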

Remark. In the absence of outside forces, a rigid body fixed at a point O is represented by a lagrangian system whose configuration space is a group, namely $SO(3)$, and the lagrangian
function  is  invariant  under  left  translations.  One  can  show  that  a  significant  part  of  Euler’s  theory 
of  rigid  body  motion  uses  only  this  property  and  therefore  holds  for  an  arbitrary  left-invariant 
lagrangian  system  on  an  arbitrary  Lie  group.  In  particular,  by  applying  this  theory  to  the  group 
of  volume-preserving  diffeomorphisms  of  a  domain  D  in  a  riemannian  manifold,  one  can  obtain 
the  basic  theorems  of  the  hydrodynamics  of  an  ideal  fluid.  (See  Appendix  2.) 


30  Lagrange’s  top 

We  consider  here  the  motion  of  an  axially  symmetric  rigid  body  fixed  at  a  stationary  point  in  a 
uniform  force  field.  This  motion  is  composed  of  three  periodic  processes:  rotation,  precession, 
and  nutation. 

A  Euler  angles 

Consider  a  rigid  body  fixed  at  a  stationary  point  O  and  subject  to  the  action 
of  the  gravitational  force  mg.  The  problem  of  the  motion  of  such  a  “heavy 
rigid  body”  has  not  yet  been  solved  in  the  general  case  and  in  some  sense  is 
unsolvable. 

In  this  problem  with  three  degrees  of  freedom,  only  two  first  integrals 
are  known:  the  total  energy  E  =  T  +  U,  and  the  projection  Mz  of  the 
angular  momentum  on  the  vertical.  There  is  an  important  special  case  in 
which the problem can be completely solved: the case of a symmetric top. A
symmetric or lagrangian top is a rigid body fixed at a stationary point O
whose  inertia  ellipsoid  at  O  is  an  ellipsoid  of  revolution  and  whose  center  of 
gravity  lies  on  the  axis  of  symmetry  e3  (Figure  125).  In  this  case,  a  rotation 


Figure  125  Lagrangian  top 


around  the  e3  axis  does  not  change  the  lagrangian  function,  and  by  Noether’s 
theorem  there  must  exist  a  first  integral  in  addition  to  E  and  Mz  (as  we  will 
see,  it  turns  out  to  be  the  projection  M3  of  the  angular  momentum  vector  on 
the  e3  axis). 

If we can introduce three coordinates so that the angles of rotation around
the z axis and around the axis of the top are among them, then these coordinates
will be cyclic, and the problem with three degrees of freedom will
reduce to a problem with one degree of freedom (for the third coordinate).

Such a choice of coordinates on the configuration space SO(3) is possible;
these coordinates $\varphi$, $\psi$, $\theta$ are called the Euler angles and form a local coordinate
system in SO(3) similar to geographical coordinates on the sphere:
they exclude the poles and are multiple-valued on one meridian.


Figure  126  Euler  angles 


We  introduce  the  following  notation  (Figure  126): 

$\mathbf{e}_x$, $\mathbf{e}_y$, and $\mathbf{e}_z$ are the unit vectors of a right-handed cartesian stationary
coordinate system at the stationary point O;
$\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are the unit vectors of a right moving coordinate system
connected to the body, directed along the principal axes at O;
$I_1 = I_2 \neq I_3$ are the moments of inertia of the body at O;
$\mathbf{e}_N$ is the unit vector of the axis $[\mathbf{e}_z, \mathbf{e}_3]$, called the “line of nodes”

(all vectors are in the “stationary space” k).

In order to carry the stationary frame ($\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$) into the moving frame
($\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$), we must perform three rotations:

1. Through an angle $\varphi$ around the $\mathbf{e}_z$ axis. Under this rotation, $\mathbf{e}_z$ remains
fixed, and $\mathbf{e}_x$ goes to $\mathbf{e}_N$.

2. Through an angle $\theta$ around the $\mathbf{e}_N$ axis. Under this rotation, $\mathbf{e}_z$ goes to
$\mathbf{e}_3$, and $\mathbf{e}_N$ remains fixed.

3. Through an angle $\psi$ around the $\mathbf{e}_3$ axis. Under this rotation, $\mathbf{e}_N$ goes to
$\mathbf{e}_1$, and $\mathbf{e}_3$ stays fixed.

After all three rotations, $\mathbf{e}_x$ has gone to $\mathbf{e}_1$, and $\mathbf{e}_z$ to $\mathbf{e}_3$; therefore, $\mathbf{e}_y$
goes to $\mathbf{e}_2$.

The angles $\varphi$, $\psi$, and $\theta$ are called the Euler angles. It is easy to prove:

Theorem. To every triple of numbers $\varphi$, $\theta$, $\psi$ the construction above associates
a rotation of three-dimensional space, $B(\varphi, \theta, \psi) \in SO(3)$, taking the
frame ($\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$) into the frame ($\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$). In addition, the mapping
$(\varphi, \theta, \psi) \to B(\varphi, \theta, \psi)$ gives local coordinates

$0 < \varphi < 2\pi, \quad 0 < \psi < 2\pi, \quad 0 < \theta < \pi$

on SO(3), the configuration space of the top. Like geographical longitude,
$\varphi$ and $\psi$ can be considered as angles mod $2\pi$; for $\theta = 0$ or $\theta = \pi$ the map
$(\varphi, \theta, \psi) \to B$ has a pole-type singularity.
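The three-rotation construction can be checked numerically. The sketch below is an illustration, not part of the original text: it assumes the stationary frame is the standard basis of R^3 and represents B as the matrix product R_z(phi) R_x(theta) R_z(psi), whose columns are the moving frame vectors written in stationary coordinates. The helper names (`rot_z`, `rot_x`, `euler_rotation`, `column`) are hypothetical.

```python
import math

def rot_z(angle):
    """Rotation through `angle` around the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    """Rotation through `angle` around the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(p, q):
    """Product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_rotation(phi, theta, psi):
    """B(phi, theta, psi); its columns are e_1, e_2, e_3 in the stationary frame."""
    return matmul(matmul(rot_z(phi), rot_x(theta)), rot_z(psi))

def column(m, j):
    return [m[i][j] for i in range(3)]

B = euler_rotation(0.3, 1.1, -0.7)
e3 = column(B, 2)   # axis of the top; its z component is cos(theta)
# At the pole theta = 0 only the sum phi + psi matters:
P = euler_rotation(0.2, 0.0, 0.5)
Q = euler_rotation(0.6, 0.0, 0.1)
```

The last two lines illustrate the pole-type singularity: at theta = 0 the triples (0.2, 0, 0.5) and (0.6, 0, 0.1) give the same rotation.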


B  Calculation  of  the  lagrangian  function 

We will express the lagrangian function in terms of the coordinates $\varphi$, $\theta$, $\psi$
and their derivatives.

The potential energy, clearly, is equal to

$U = \iiint z g \, dm = m g z_0 = m g l \cos\theta$,

where $z_0$ is the height of the center of gravity above O (Figure 125).

We now calculate the kinetic energy. A small trick is useful here: we
consider the particular case when $\varphi = \psi = 0$.


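Lemma. The angular velocity of a top is expressed in terms of the derivatives
of the Euler angles by the formula

$\boldsymbol{\omega} = \dot\theta\,\mathbf{e}_1 + (\dot\varphi \sin\theta)\,\mathbf{e}_2 + (\dot\psi + \dot\varphi \cos\theta)\,\mathbf{e}_3$,

if $\varphi = \psi = 0$.

Proof. We look at the velocity of a point of the top occupying the position
$\mathbf{r}$ at time t. After time dt this point takes the position (within $(dt)^2$)

$B(\varphi + d\varphi, \theta + d\theta, \psi + d\psi)\,B^{-1}(\varphi, \theta, \psi)\,\mathbf{r}$,

where $d\varphi = \dot\varphi\,dt$, $d\theta = \dot\theta\,dt$ and $d\psi = \dot\psi\,dt$.

Consequently, to the same accuracy the displacement vector is the sum
of the three terms

$B(\varphi + d\varphi, \theta, \psi)\,B^{-1}(\varphi, \theta, \psi)\,\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\varphi, \mathbf{r}]\,dt$,

$B(\varphi, \theta + d\theta, \psi)\,B^{-1}(\varphi, \theta, \psi)\,\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\theta, \mathbf{r}]\,dt$,

$B(\varphi, \theta, \psi + d\psi)\,B^{-1}(\varphi, \theta, \psi)\,\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\psi, \mathbf{r}]\,dt$

(the angular velocities $\boldsymbol{\omega}_\varphi$, $\boldsymbol{\omega}_\theta$, and $\boldsymbol{\omega}_\psi$ are defined by these formulas).

Therefore, the velocity of the point $\mathbf{r}$ is $\mathbf{v} = [\boldsymbol{\omega}_\varphi + \boldsymbol{\omega}_\theta + \boldsymbol{\omega}_\psi, \mathbf{r}]$, so the
angular velocity of the body is

$\boldsymbol{\omega} = \boldsymbol{\omega}_\varphi + \boldsymbol{\omega}_\theta + \boldsymbol{\omega}_\psi$,

where the terms are defined by the formulas above.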
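It remains to decompose the vectors $\boldsymbol{\omega}_\varphi$, $\boldsymbol{\omega}_\theta$, and $\boldsymbol{\omega}_\psi$ with respect to
$\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. We have not yet used the fact that $\varphi = \psi = 0$. If $\varphi = \psi = 0$,
then

$B(\varphi + d\varphi, \theta, \psi)\,B^{-1}(\varphi, \theta, \psi)$

is simply a rotation around the axis $\mathbf{e}_z$ through an angle $d\varphi$, so

$\boldsymbol{\omega}_\varphi = \dot\varphi\,\mathbf{e}_z$.

Furthermore, $B(\varphi, \theta + d\theta, \psi)\,B^{-1}(\varphi, \theta, \psi)$ is simply a rotation around the
axis $\mathbf{e}_N = \mathbf{e}_x = \mathbf{e}_1$ through an angle $d\theta$ in the case $\varphi = \psi = 0$, so

$\boldsymbol{\omega}_\theta = \dot\theta\,\mathbf{e}_1$.

Finally, $B(\varphi, \theta, \psi + d\psi)\,B^{-1}(\varphi, \theta, \psi)$ is a rotation through an angle $d\psi$
around the axis $\mathbf{e}_3$, so

$\boldsymbol{\omega}_\psi = \dot\psi\,\mathbf{e}_3$.

In short, for $\varphi = \psi = 0$ we have

$\boldsymbol{\omega} = \dot\varphi\,\mathbf{e}_z + \dot\theta\,\mathbf{e}_1 + \dot\psi\,\mathbf{e}_3$.

But, clearly, for $\varphi = \psi = 0$

$\mathbf{e}_z = \mathbf{e}_3 \cos\theta + \mathbf{e}_2 \sin\theta$.

So the components of the angular velocity along the principal axes $\mathbf{e}_1$, $\mathbf{e}_2$,
and $\mathbf{e}_3$ are

$\omega_1 = \dot\theta, \quad \omega_2 = \dot\varphi \sin\theta, \quad \omega_3 = \dot\psi + \dot\varphi \cos\theta$. □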

Since $T = \frac{1}{2}(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2)$, the kinetic energy for $\varphi = \psi = 0$ is
given by the formula

$T = \frac{I_1}{2}(\dot\theta^2 + \dot\varphi^2 \sin^2\theta) + \frac{I_3}{2}(\dot\psi + \dot\varphi \cos\theta)^2$.

But the kinetic energy cannot depend on $\varphi$ and $\psi$: these are cyclic coordinates,
and by a choice of origin of reference for $\varphi$ and $\psi$ which does not
change T we can always make $\varphi = 0$ and $\psi = 0$. Thus the formula we got
for the kinetic energy is true for all $\varphi$ and $\psi$.

In this way we obtain the lagrangian function

$L = \frac{I_1}{2}(\dot\theta^2 + \dot\varphi^2 \sin^2\theta) + \frac{I_3}{2}(\dot\psi + \dot\varphi \cos\theta)^2 - m g l \cos\theta$.
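As a small illustration, the components of the angular velocity along the principal axes and the resulting lagrangian function can be evaluated directly. This is only a sketch: the function names and the sample values of the moments of inertia, mass, and length are assumptions chosen for the example, and I_2 = I_1 is used as for a symmetric top.

```python
import math

def body_angular_velocity(theta, dphi, dtheta, dpsi):
    """Components (w1, w2, w3) of the angular velocity along e_1, e_2, e_3."""
    return (dtheta,
            dphi * math.sin(theta),
            dpsi + dphi * math.cos(theta))

def lagrangian(theta, dphi, dtheta, dpsi, I1, I3, m, g, l):
    """L = T - U for a symmetric top (I_2 = I_1)."""
    w1, w2, w3 = body_angular_velocity(theta, dphi, dtheta, dpsi)
    kinetic = 0.5 * (I1 * w1 ** 2 + I1 * w2 ** 2 + I3 * w3 ** 2)
    potential = m * g * l * math.cos(theta)
    return kinetic - potential

# Horizontal axis (theta = pi/2), pure precession with unit angular velocity:
value = lagrangian(math.pi / 2, 1.0, 0.0, 0.0, I1=2.0, I3=1.0, m=1.0, g=9.8, l=0.5)
```

In this sample state the potential energy vanishes and only the term with I_1 contributes, so the value is I_1/2.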

C  Investigation  of  the  motion 

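To the cyclic coordinates $\varphi$ and $\psi$ there correspond the first integrals

$\frac{\partial L}{\partial \dot\varphi} = M_z = \dot\varphi\,(I_1 \sin^2\theta + I_3 \cos^2\theta) + \dot\psi\, I_3 \cos\theta$,

$\frac{\partial L}{\partial \dot\psi} = M_3 = \dot\varphi\, I_3 \cos\theta + \dot\psi\, I_3$.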
Theorem. The inclination $\theta$ of the axis of the top to the vertical changes with
time in the same way as in the one-dimensional system with energy

$E' = \frac{I_1}{2}\dot\theta^2 + U_{eff}(\theta)$,

where the effective potential energy is given by the formula

$U_{eff} = \frac{(M_z - M_3 \cos\theta)^2}{2 I_1 \sin^2\theta} + m g l \cos\theta$.


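Proof. Following the general theory, we express $\dot\varphi$ and $\dot\psi$ in terms of $M_3$
and $M_z$. We get the total energy of the system as

$E = \frac{I_1}{2}\dot\theta^2 + \frac{M_3^2}{2 I_3} + m g l \cos\theta + \frac{(M_z - M_3 \cos\theta)^2}{2 I_1 \sin^2\theta}$

and

$\dot\varphi = \frac{M_z - M_3 \cos\theta}{I_1 \sin^2\theta}$.

The number $M_3^2/2I_3 = E - E'$, independent of $\theta$, does not affect the
equation for $\theta$. □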
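The reduction to one degree of freedom can be verified numerically: for any state of the top, the full energy T + U should equal the energy of the reduced system plus the constant M_3^2/2I_3. The sketch below assumes illustrative function names and sample values; `mgl` stands for the product m g l.

```python
import math

def momenta(theta, dphi, dpsi, I1, I3):
    """First integrals M_z and M_3 computed from the velocities."""
    M3 = I3 * (dpsi + dphi * math.cos(theta))
    Mz = dphi * I1 * math.sin(theta) ** 2 + M3 * math.cos(theta)
    return Mz, M3

def u_eff(theta, Mz, M3, I1, mgl):
    """Effective potential energy of the one-dimensional system."""
    return ((Mz - M3 * math.cos(theta)) ** 2 / (2.0 * I1 * math.sin(theta) ** 2)
            + mgl * math.cos(theta))

def phi_dot(theta, Mz, M3, I1):
    """Rate of change of the azimuth expressed through the first integrals."""
    return (Mz - M3 * math.cos(theta)) / (I1 * math.sin(theta) ** 2)

def total_energy(theta, dphi, dtheta, dpsi, I1, I3, mgl):
    """E = T + U written directly in Euler angles."""
    kinetic = (0.5 * I1 * (dtheta ** 2 + (dphi * math.sin(theta)) ** 2)
               + 0.5 * I3 * (dpsi + dphi * math.cos(theta)) ** 2)
    return kinetic + mgl * math.cos(theta)

theta, dphi, dtheta, dpsi = 0.9, 0.4, -0.3, 5.0
I1, I3, mgl = 2.0, 1.0, 3.0
Mz, M3 = momenta(theta, dphi, dpsi, I1, I3)
E = total_energy(theta, dphi, dtheta, dpsi, I1, I3, mgl)
E_reduced = 0.5 * I1 * dtheta ** 2 + M3 ** 2 / (2.0 * I3) + u_eff(theta, Mz, M3, I1, mgl)
```

Here `E` and `E_reduced` should agree up to rounding, and `phi_dot(theta, Mz, M3, I1)` should return the original azimuthal velocity.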

In order to study the one-dimensional system above it is convenient to
make the substitution $\cos\theta = u$ ($-1 \le u \le 1$).

We also write

$\frac{M_z}{I_1} = a, \quad \frac{M_3}{I_1} = b, \quad \frac{2E'}{I_1} = \alpha, \quad \frac{2 m g l}{I_1} = \beta > 0$.

Then we can rewrite the law of conservation of energy $E'$ as

$\dot u^2 = f(u)$,

where $f(u) = (\alpha - \beta u)(1 - u^2) - (a - bu)^2$, and the law of variation of
the azimuth $\varphi$ as

$\dot\varphi = \frac{a - bu}{1 - u^2}$.

We notice that $f(u)$ is a polynomial of degree 3, $f(+\infty) = +\infty$, and
$f(\pm 1) = -(a \mp b)^2 < 0$ if $a \neq \pm b$. On the other hand, actual motions
correspond to constants a, b, $\alpha$, and $\beta$ for which $f(u) \ge 0$ for some
$-1 \le u \le 1$. Thus $f(u)$ has exactly two real roots $u_1$ and $u_2$ on the interval
$-1 \le u \le 1$ (and one for $u > 1$, Figure 127). Therefore, the inclination $\theta$
of the axis of the top changes periodically between two limit values $\theta_1$ and $\theta_2$
(Figure 128). This periodic change in inclination is called nutation.
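The limits of nutation are the two roots of the cubic f on the interval [-1, 1]. A minimal numerical sketch follows; it locates them by scanning for sign changes and bisecting. The parameter values are arbitrary illustrations (not taken from the text), and the function names are hypothetical.

```python
import math

def f_top(u, a, b, alpha, beta):
    """f(u) = (alpha - beta*u)(1 - u^2) - (a - b*u)^2, with u = cos(theta)."""
    return (alpha - beta * u) * (1.0 - u * u) - (a - b * u) ** 2

def nutation_limits(a, b, alpha, beta, samples=2000, iterations=80):
    """Roots of f in [-1, 1], found by a sign-change scan followed by bisection."""
    roots = []
    for i in range(samples):
        lo = -1.0 + 2.0 * i / samples
        hi = -1.0 + 2.0 * (i + 1) / samples
        lo_negative = f_top(lo, a, b, alpha, beta) < 0
        if lo_negative == (f_top(hi, a, b, alpha, beta) < 0):
            continue  # f does not change sign on this cell
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if (f_top(mid, a, b, alpha, beta) < 0) == lo_negative:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

# Sample constants for which f is positive somewhere on (-1, 1):
u1, u2 = nutation_limits(a=1.0, b=2.0, alpha=1.2, beta=1.0)
theta_lower_axis, theta_upper_axis = math.acos(u1), math.acos(u2)
```

Since u = cos(theta) decreases with theta, the smaller root u1 corresponds to the larger inclination.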
We now consider the motion of the azimuth of the axis of the top. The
point of intersection of the axis with the unit sphere moves in the ring between
the parallels $\theta_1$ and $\theta_2$. The variation of the azimuth of the axis is determined
by the equation

$\dot\varphi = \frac{a - bu}{1 - u^2}$.

If the root $u'$ of the equation $a = bu$ lies outside of $(u_1, u_2)$, then the angle $\varphi$
varies monotonically and the axis traces a curve like a sinusoid on the unit
sphere (Figure 128(a)). If the root $u'$ of the equation $a = bu$ lies inside
$(u_1, u_2)$, then the rate of change of $\varphi$ is in opposite directions on the parallels
$\theta_1$ and $\theta_2$, and the axis traces a looping curve on the sphere (Figure 128(b)).
If the root $u'$ of $a = bu$ lies on the boundary (e.g., $u' = u_2$), then the axis
traces a curve with cusps (Figure 128(c)).
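The three cases can be summarized in a few lines of code. This is only a sketch of the case distinction: the function name, the string labels, and the tolerance used to decide that the root lies on the boundary are assumptions made for the example.

```python
def precession_type(a, b, u1, u2, tol=1e-9):
    """Classify the path of the axis by the position of the root u' = a/b of a = b*u."""
    if b == 0:
        # a - b*u never vanishes (assuming a != 0), so the azimuth is monotone.
        return "monotone"
    u_prime = a / b
    if abs(u_prime - u1) <= tol or abs(u_prime - u2) <= tol:
        return "cusps"      # root on the boundary of (u1, u2)
    if u1 < u_prime < u2:
        return "loops"      # azimuthal velocity changes sign inside the ring
    return "monotone"       # root outside (u1, u2)

kind = precession_type(a=1.0, b=2.0, u1=-0.06, u2=0.73)
```

For the sample values the root u' = 0.5 lies inside the interval, so the axis traces a looping curve.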

The  last  case,  although  exceptional,  is  observed  every  time  we  release 
the axis of a top launched at inclination $\theta_2$ without initial velocity; the top
first  falls,  but  then  rises  again. 

The  azimuthal  motion  of  the  top  is  called  precession.  The  complete 
motion  of  the  top  consists  of  rotation  around  its  own  axis,  nutation,  and 
precession.  Each  of  the  three  motions  has  its  own  frequency.  If  the  frequencies 
are  incommensurable,  the  top  never  returns  to  its  initial  position,  although 
it  approaches  it  arbitrarily  closely. 


Figure  128  Path  of  the  top’s  axis  on  the  unit  sphere 
31 Sleeping tops and fast tops

The  formulas  obtained  in  Section  30  reduce  the  solution  of  the  equations  of  motion  of  a  top  to 
elliptic  integrals.  However,  qualitative  information  about  the  motion  is  usually  easy  to  obtain 
without  turning  to  quadrature. 

In  this  paragraph  we  investigate  the  stability  of  a  vertical  top  and  give  approximate  formulas 
for  the  motion  of  a  rapidly  spinning  top. 

A  Sleeping  tops 

We consider first the particular solution of the equations of motion in
which the axis of the top is always vertical ($\theta = 0$) and the angular velocity
is constant (a “sleeping” top). In this case, clearly, $M_z = M_3 = I_3\omega_3$
(Figure 129).


Problem.  Show  that  a  stationary  rotation  around  the  vertical  axis  is  always  Liapunov  unstable. 


We will look at the motion of the axis of the top, and not of the top itself.
Will the axis of the top stably remain close to the vertical, i.e., will $\theta$ remain
small? Expressing the effective potential energy of the system

$U_{eff} = \frac{(M_z - M_3 \cos\theta)^2}{2 I_1 \sin^2\theta} + m g l \cos\theta$

as a power series in $\theta$, we find

$U_{eff} = \frac{I_3^2\omega_3^2(\theta^4/4)}{2 I_1 \theta^2} + \cdots - m g l \frac{\theta^2}{2} = C + A\theta^2 + \cdots, \qquad A = \frac{\omega_3^2 I_3^2}{8 I_1} - \frac{m g l}{2}$.

If $A > 0$, the equilibrium position $\theta = 0$ of the one-dimensional system
is stable, and if $A < 0$ it is unstable. Thus, the condition for stability has the
form

$\omega_3^2 > \frac{4 m g l I_1}{I_3^2}$.
When friction reduces the velocity of a sleeping top to below this limit, the
top wakes up.

Problem. Show that, for $\omega_3^2 > 4 m g l I_1/I_3^2$, the axis of a sleeping top is stable with respect to
perturbations which change the values of $M_z$ and $M_3$, as well as $\theta$.
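The stability threshold for a sleeping top is easy to evaluate. The sketch below computes the coefficient A of the quadratic term of the effective potential and the critical spin at which it changes sign; the function names and the numerical values are illustrative assumptions, and `mgl` denotes the product m g l.

```python
import math

def quadratic_coefficient(omega3, I1, I3, mgl):
    """A = omega3^2 I3^2 / (8 I1) - mgl / 2 in the expansion U_eff = C + A theta^2 + ..."""
    return (omega3 * I3) ** 2 / (8.0 * I1) - mgl / 2.0

def critical_spin(I1, I3, mgl):
    """Angular velocity at which A = 0, i.e. omega3^2 = 4 mgl I1 / I3^2."""
    return math.sqrt(4.0 * mgl * I1) / I3

I1, I3, mgl = 2.0, 1.0, 3.0          # sample values, not from the text
omega_c = critical_spin(I1, I3, mgl)
stable = quadratic_coefficient(1.5 * omega_c, I1, I3, mgl) > 0
unstable = quadratic_coefficient(0.5 * omega_c, I1, I3, mgl) < 0
```

Above the critical spin the coefficient is positive and the vertical position of the axis is stable; below it the top wakes up.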

B  Fast  tops 

A top is called fast if the kinetic energy of its rotation is large in comparison
with its potential energy:

$\frac{1}{2} I_3 \omega_3^2 \gg m g l$.

It is clear from a similarity argument that multiplying the angular velocity
by N is exactly equivalent to dividing the weight by $N^2$.

Theorem. If, while the initial position of a top is preserved, the angular velocity
is multiplied by N, then the trajectory of the top will be exactly the same as
if the angular velocity remained as it was and the acceleration of gravity
g were divided by $N^2$. In the case of large angular velocity the trajectory
clearly goes N times faster.51

In this way we can study the case $g \to 0$ and apply the results to study
the case $\omega \to \infty$.

To  begin,  we  consider  the  case  g  =  0,  i.e.,  the  motion  of  a  symmetric 
top  in  the  absence  of  gravity.  We  compare  two  descriptions  of  this  motion: 
Lagrange’s  (Section  30C)  and  Poinsot’s  (Section  29C). 

We first consider Lagrange’s equation for the variation of the angle of
inclination $\theta$ of the top’s axis.

Lemma. In the absence of gravity, the angle $\theta_0$ satisfying $M_z = M_3 \cos\theta_0$
is a stable equilibrium position of the equation of motion of the top’s axis.
The frequency of small oscillations of $\theta$ near this equilibrium position is
equal to

$\omega_{nut} = \frac{I_3 \omega_3}{I_1}$.

Proof. In the absence of gravity the effective potential energy reduces to

$U_{eff} = \frac{(M_z - M_3 \cos\theta)^2}{2 I_1 \sin^2\theta}$.

This nonnegative function has the minimum value of zero for the angle $\theta = \theta_0$ determined by
the condition $M_z = M_3 \cos\theta_0$ (Figure 130). Thus, the angle of inclination $\theta_0$ of the top’s axis
to the vertical is stably stationary: for small deviations of the initial angle $\theta$ from $\theta_0$, there will
be periodic oscillations of $\theta$ near $\theta_0$ (nutation).

Figure 130 Effective potential energy of a top

51 Denote by $\varphi_g(t, \xi)$ the position of the top at time t with initial condition $\xi \in TSO(3)$ and
gravitational acceleration g. Then the theorem says that $\varphi_g(t, N\xi) = \varphi_{N^{-2}g}(Nt, \xi)$.

The frequency of these oscillations is easily
determined by the following general formula: the frequency $\omega$ of small oscillations in a
one-dimensional system with energy

$E = \frac{a \dot x^2}{2} + U(x), \qquad U(x_0) = \min U(x)$

is given (Section 22D) by the formula

$\omega^2 = \frac{U''(x_0)}{a}$.

The energy of the one-dimensional system describing oscillations of the inclination of the top’s
axis is

$E' = \frac{I_1}{2}\dot\theta^2 + U_{eff}$.

For $\theta = \theta_0 + x$ we find $M_z - M_3 \cos\theta = M_3(\cos\theta_0 - \cos(\theta_0 + x)) = M_3 x \sin\theta_0 + O(x^2)$, so

$U_{eff} = \frac{M_3^2 x^2 \sin^2\theta_0}{2 I_1 \sin^2\theta_0} + o(x^2) = \frac{I_3^2\omega_3^2}{2 I_1} x^2 + \cdots$,

from which we obtain the expression for the frequency of nutation

$\omega_{nut} = \frac{I_3 \omega_3}{I_1}$. □
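The nutation frequency in the absence of gravity can be cross-checked numerically: a central finite difference gives the second derivative of the effective potential at its minimum, and the general formula for small oscillations then gives the frequency. The function names and sample values below are assumptions made for this sketch.

```python
import math

def u_eff_no_gravity(theta, theta0, M3, I1):
    """Effective potential for g = 0, with M_z = M_3 cos(theta0)."""
    Mz = M3 * math.cos(theta0)
    return (Mz - M3 * math.cos(theta)) ** 2 / (2.0 * I1 * math.sin(theta) ** 2)

def nutation_frequency_numeric(theta0, I1, I3, omega3, h=1e-4):
    """sqrt(U''(theta0) / I1), with U'' estimated by a central difference."""
    M3 = I3 * omega3
    U = lambda th: u_eff_no_gravity(th, theta0, M3, I1)
    second = (U(theta0 + h) - 2.0 * U(theta0) + U(theta0 - h)) / h ** 2
    return math.sqrt(second / I1)

I1, I3, omega3, theta0 = 2.0, 1.0, 30.0, 0.8   # sample values, not from the text
numeric = nutation_frequency_numeric(theta0, I1, I3, omega3)
closed_form = I3 * omega3 / I1
```

The two values should agree to within the truncation error of the finite difference.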


From the formula $\dot\varphi = (M_z - M_3 \cos\theta)/I_1 \sin^2\theta$ it is clear that, for
$\theta = \theta_0$, the azimuth of the axis does not change with time: the axis is
stationary. The azimuthal motion of the axis under small deviations of $\theta$
from $\theta_0$ could also be studied with the help of this formula, but we will deal
with it differently.

The  motion  of  a  top  in  the  absence  of  gravity  can  be  considered  in 
Poinsot’s  description.  Then  the  axis  of  the  top  rotates  uniformly  around  the 
angular  momentum  vector,  preserving  its  position  in  space.  Thus,  the  axis 
of  the  top  describes  a  circle  on  the  sphere  whose  center  corresponds  to  the 
angular  momentum  vector  (Figure  131). 

Remark. Now the motion of the top’s axis, which according to Lagrange was called nutation,
is called precession in Poinsot’s description of motion.

Figure 131 Comparison of the descriptions of the motion of a top according to
Lagrange and Poinsot

This means that the formula obtained above for the frequency of a small
nutation, $\omega_{nut} = I_3\omega_3/I_1$, agrees with the formula for the frequency of
precession $\omega = M/I_1$ in Poinsot’s description: when the amplitude of
nutation approaches zero, $I_3\omega_3 \to M$.

C  A  top  in  a  weak  field 

We go now to the case when the force of gravity is not absent, but is very
small (the values of $M_z$ and $M_3$ are fixed). In this case a term $m g l \cos\theta$,
small together with its derivatives, is added to the effective potential energy.
We will show that this term slightly changes the frequency of nutation.

Lemma. Suppose that the function $f(x)$ has a minimum at $x = 0$ and Taylor expansion $f(x) = Ax^2/2 + \cdots$, $A > 0$. Suppose that the function $h(x)$ has Taylor expansion $h(x) = B + Cx + \cdots$.
Then, for sufficiently small $\varepsilon$, the function $f_\varepsilon(x) = f(x) + \varepsilon h(x)$ has a minimum at the point
(Figure 132)

$x_\varepsilon = -\frac{C\varepsilon}{A} + O(\varepsilon^2)$,

which is close to zero. In addition, $f_\varepsilon''(x_\varepsilon) = A + O(\varepsilon)$.

Proof. We have $f_\varepsilon'(x) = Ax + C\varepsilon + O(x^2) + O(\varepsilon x)$, and the result is obtained by applying the
implicit function theorem to $f_\varepsilon'(x)$. □

Figure 132 Displacement of the minimum under a small change of the function
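The displacement of the minimum can be illustrated on a concrete pair of functions. The sketch below assumes the sample functions f(x) = A x^2/2 + x^3 and h(x) = B + C x + x^2 (chosen only for illustration; B does not affect the location of the minimum) and finds the critical point of f + eps*h near zero by Newton's method.

```python
def shifted_minimum(A, C, eps, iterations=50):
    """Minimum of f(x) + eps*h(x) near 0 for f = A x^2/2 + x^3, h = B + C x + x^2.

    Returns the minimum point and the second derivative there.
    """
    first = lambda x: A * x + 3.0 * x * x + eps * (C + 2.0 * x)
    second = lambda x: A + 6.0 * x + 2.0 * eps
    x = 0.0
    for _ in range(iterations):
        x -= first(x) / second(x)   # Newton step on the derivative
    return x, second(x)

A, C, eps = 2.0, 1.0, 1e-3
x_eps, curvature = shifted_minimum(A, C, eps)
predicted = -C * eps / A
```

The computed point differs from the first-order prediction -C*eps/A by a quantity of order eps^2, and the curvature differs from A by a quantity of order eps.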
By the lemma, the effective potential energy for small g has a minimum
$\theta_g$ close to $\theta_0$, and at this point $U''$ differs slightly from $U''(\theta_0)$. Therefore, the
frequency of a small nutation near $\theta_0$ is close to that obtained for $g = 0$:

$\lim_{g \to 0} \omega_{nut} = \frac{I_3}{I_1}\omega_3$.


D  A  rapidly  thrown  top 

We now consider the special initial conditions when we release the axis of
the top without an initial push from a position with inclination $\theta_0$ to the
vertical.

Theorem. If the axis of the top is stationary at the initial moment ($\dot\varphi = \dot\theta = 0$)
and the top is rotating rapidly around its axis ($\omega_3 \to \infty$), which is inclined
from the vertical with angle $\theta_0$ ($M_z = M_3 \cos\theta_0$), then asymptotically, as
$\omega_3 \to \infty$,

1. the nutation frequency is proportional to the angular velocity;

2. the amplitude of nutation is inversely proportional to the square of the
angular velocity;

3. the frequency of precession is inversely proportional to the angular
velocity;

4. the following asymptotic formulas hold (as $\omega_3 \to \infty$):

$\omega_{nut} \sim \frac{I_3}{I_1}\omega_3, \qquad a_{nut} \sim \frac{I_1 m g l}{I_3^2 \omega_3^2} \sin\theta_0, \qquad \omega_{prec} \sim \frac{m g l}{I_3 \omega_3}$

(here $f(\omega_3) \sim g(\omega_3)$ if $\lim_{\omega_3 \to \infty} (f/g) = 1$).

For  the  proof,  we  look  at  the  case  when  the  initial  angular  velocity  is 
fixed,  but  g  -*  0.  Then  by  interpreting  the  formulas  with  the  aid  of  a  similarity 
argument  (cf.  Section  B),  we  obtain  the  theorem. 
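The asymptotic formulas of the theorem are simple enough to tabulate. The sketch below evaluates them and makes the stated scaling with the angular velocity visible; the function name and the numerical values are assumptions chosen for illustration.

```python
import math

def fast_top_asymptotics(I1, I3, m, g, l, omega3, theta0):
    """Leading-order nutation frequency, nutation amplitude, precession frequency."""
    omega_nut = I3 * omega3 / I1
    a_nut = I1 * m * g * l * math.sin(theta0) / (I3 * omega3) ** 2
    omega_prec = m * g * l / (I3 * omega3)
    return omega_nut, a_nut, omega_prec

params = dict(I1=2.0, I3=1.0, m=1.0, g=9.8, l=0.5, theta0=0.6)   # sample values
slow = fast_top_asymptotics(omega3=100.0, **params)
fast = fast_top_asymptotics(omega3=200.0, **params)
```

Doubling the angular velocity doubles the nutation frequency, divides the amplitude of nutation by four, and halves the frequency of precession.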


We  already  know  from  Section  30C  that  under  our  initial  conditions  the  axis  of  the  top  traces 
a  curve  with  cusps  on  the  sphere. 


Figure 133 Definition of the amplitude of nutation


We apply the lemma to locate the minimum point $\theta_g$ of the effective potential energy. We
set (Figure 133)

$\theta = \theta_0 + x, \qquad \cos\theta = \cos\theta_0 - x \sin\theta_0 + \cdots$.

Then we obtain, as above, the Taylor expansion in x at $\theta_0$

$U_{eff}|_{g=0} = \frac{I_3^2\omega_3^2}{2 I_1} x^2 + \cdots, \qquad m g l \cos\theta = m g l \cos\theta_0 - x\, m g l \sin\theta_0 + \cdots$.

Applying the lemma to $f = U_{eff}|_{g=0}$, $\varepsilon = g$, $h = m l \cos(\theta_0 + x)$, we find that the minimum of the
effective potential energy $U_{eff}$ is attained at angle of inclination

$\theta_g = \theta_0 + x_g, \qquad x_g = \frac{I_1 m l \sin\theta_0}{I_3^2 \omega_3^2}\, g + O(g^2)$.

Thus the inclination $\theta$ of the top’s axis will oscillate near $\theta_g$ (Figure 134). But, at the initial moment,
$\theta = \theta_0$ and $\dot\theta = 0$. This means that $\theta_0$ corresponds to the highest position of the axis of the top.
Thus, for small g, the amplitude of nutation is asymptotically equal to

$a_{nut} \sim x_g \sim \frac{I_1 m l \sin\theta_0}{I_3^2 \omega_3^2}\, g \qquad (g \to 0)$.


We now find the precessional motion of the axis. From the general formula

$\dot\varphi = \frac{M_z - M_3 \cos\theta}{I_1 \sin^2\theta}$

for $M_z = M_3 \cos\theta_0$ and $\theta = \theta_0 + x$, we find that $M_z - M_3 \cos\theta = M_3 x \sin\theta_0 + \cdots$; so

$\dot\varphi = \frac{M_3}{I_1 \sin\theta_0}\, x + \cdots$.

But x oscillates harmonically between 0 and $2x_g$ (up to $O(g^2)$). Therefore, the average value of
the velocity of precession over the period of nutation is asymptotically equal to

$\bar{\dot\varphi} \sim \frac{M_3}{I_1 \sin\theta_0}\, x_g \sim \frac{m g l}{I_3 \omega_3} \qquad (g \to 0)$.

Problem. Show that

$\lim_{g \to 0} \lim_{t \to \infty} \frac{\varphi(t) - \varphi(0)}{t\, m g l / I_3 \omega_3} = 1$.
PART III

HAMILTONIAN  MECHANICS 


Hamiltonian  mechanics  is  geometry  in  phase  space.  Phase  space  has  the 
structure of a symplectic manifold. The group of symplectic diffeomorphisms
acts  on  phase  space.  The  basic  concepts  and  theorems  of  hamiltonian 
mechanics  (even  when  formulated  in  terms  of  local  symplectic  coordinates) 
are  invariant  under  this  group  (and  under  the  larger  group  of  transformations 
which  also  transform  time). 

A hamiltonian mechanical system is given by an even-dimensional manifold
(the “phase space”), a symplectic structure on it (the “Poincaré integral
invariant”) and a function on it (the “hamiltonian function”). Every
one-parameter group of symplectic diffeomorphisms of the phase space preserving
the hamiltonian function is associated to a first integral of the
equations of motion.

Lagrangian  mechanics  is  contained  in  hamiltonian  mechanics  as  a  special 
case  (the  phase  space  in  this  case  is  the  cotangent  bundle  of  the  configuration 
space, and the hamiltonian function is the Legendre transform of the lagrangian
function).

The  hamiltonian  point  of  view  allows  us  to  solve  completely  a  series  of 
mechanical  problems  which  do  not  yield  solutions  by  other  means  (for 
example,  the  problem  of  attraction  by  two  stationary  centers  and  the  problem 
of  geodesics  on  the  triaxial  ellipsoid).  The  hamiltonian  point  of  view  has 
even  greater  value  for  the  approximate  methods  of  perturbation  theory 
(celestial  mechanics),  for  understanding  the  general  character  of  motion 
in  complicated  mechanical  systems  (ergodic  theory,  statistical  mechanics) 
and  in  connection  with  other  areas  of  mathematical  physics  (optics,  quantum 
mechanics,  etc.). 


Differential  forms 


Exterior  differential  forms  arise  when  concepts  such  as  the  work  of  a  field 
along  a  path  and  the  flux  of  a  fluid  through  a  surface  are  generalized  to  higher 
dimensions. 

Hamiltonian  mechanics  cannot  be  understood  without  differential  forms. 
The information we need about differential forms involves exterior multiplication,
exterior differentiation, integration, and Stokes’ formula.

32  Exterior  forms 

Here we define exterior algebraic forms.

A  1-forms 

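Let $\mathbb{R}^n$ be an n-dimensional real vector space.52 We will denote vectors in this
space by $\boldsymbol{\xi}, \boldsymbol{\eta}, \ldots$.

Definition. A form of degree 1 (or a 1-form) is a linear function $\omega: \mathbb{R}^n \to \mathbb{R}$, i.e.,

$\omega(\lambda_1 \boldsymbol{\xi}_1 + \lambda_2 \boldsymbol{\xi}_2) = \lambda_1 \omega(\boldsymbol{\xi}_1) + \lambda_2 \omega(\boldsymbol{\xi}_2), \qquad \forall \lambda_1, \lambda_2 \in \mathbb{R}$ and $\boldsymbol{\xi}_1, \boldsymbol{\xi}_2 \in \mathbb{R}^n$.

We recall the basic facts about 1-forms from linear algebra. The set of all
1-forms becomes a real vector space if we define the sum of two forms by

$(\omega_1 + \omega_2)(\boldsymbol{\xi}) = \omega_1(\boldsymbol{\xi}) + \omega_2(\boldsymbol{\xi})$,

and scalar multiplication by

$(\lambda\omega)(\boldsymbol{\xi}) = \lambda\,\omega(\boldsymbol{\xi})$.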


52 It is essential to note that we do not fix any special euclidean structure on $\mathbb{R}^n$. In some examples
we use such a structure; in these cases this will be specifically stated (“euclidean $\mathbb{R}^n$”).

The space of 1-forms on $\mathbb{R}^n$ is itself n-dimensional, and is also called the dual
space $(\mathbb{R}^n)^*$.

Suppose that we have chosen a linear coordinate system $x_1, \ldots, x_n$ on $\mathbb{R}^n$.
Each coordinate $x_i$ is itself a 1-form. These n 1-forms are linearly independent.
Therefore, every 1-form $\omega$ has the form

$\omega = a_1 x_1 + \cdots + a_n x_n, \qquad a_i \in \mathbb{R}$.

The value of $\omega$ on a vector $\boldsymbol{\xi}$ is equal to

$\omega(\boldsymbol{\xi}) = a_1 x_1(\boldsymbol{\xi}) + \cdots + a_n x_n(\boldsymbol{\xi})$,

where $x_1(\boldsymbol{\xi}), \ldots, x_n(\boldsymbol{\xi})$ are the components of $\boldsymbol{\xi}$ in the chosen coordinate
system.
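In coordinates a 1-form is determined by its coefficients, and the vector space operations on forms act coefficientwise. A minimal sketch, with hypothetical helper names and vectors represented as tuples of components:

```python
def one_form(coefficients):
    """The 1-form a_1 x_1 + ... + a_n x_n with the given coefficients."""
    def omega(xi):
        return sum(a * x for a, x in zip(coefficients, xi))
    return omega

def add_forms(w1, w2):
    """Sum of two 1-forms."""
    return lambda xi: w1(xi) + w2(xi)

def scale_form(lam, w):
    """Product of a 1-form with a scalar."""
    return lambda xi: lam * w(xi)

# Work of a uniform force F on a displacement xi, as a 1-form on euclidean R^3:
work = one_form((1.0, 2.0, 3.0))
value = work((1.0, 0.0, -1.0))
```

For the sample force and displacement the value is the scalar product (F, xi) = 1 - 3 = -2.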


Example. If a uniform force field $\mathbf{F}$ is given on euclidean $\mathbb{R}^3$, its work A on the displacement $\boldsymbol{\xi}$
is a 1-form acting on $\boldsymbol{\xi}$ (Figure 135): $\omega(\boldsymbol{\xi}) = (\mathbf{F}, \boldsymbol{\xi})$.

Figure 135 The work of a force is a 1-form acting on the displacement.


B  2-forms 

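Definition. An exterior form of degree 2 (or a 2-form) is a function on pairs of
vectors $\omega^2: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, which is bilinear and skew symmetric:

$\omega^2(\lambda_1 \boldsymbol{\xi}_1 + \lambda_2 \boldsymbol{\xi}_2, \boldsymbol{\xi}_3) = \lambda_1 \omega^2(\boldsymbol{\xi}_1, \boldsymbol{\xi}_3) + \lambda_2 \omega^2(\boldsymbol{\xi}_2, \boldsymbol{\xi}_3)$,

$\omega^2(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = -\omega^2(\boldsymbol{\xi}_2, \boldsymbol{\xi}_1)$,

$\forall \lambda_1, \lambda_2 \in \mathbb{R}$, $\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\xi}_3 \in \mathbb{R}^n$.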


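Example 1. Let $S(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2)$ be the oriented area of the parallelogram constructed on the vectors
$\boldsymbol{\xi}_1$ and $\boldsymbol{\xi}_2$ of the oriented euclidean plane $\mathbb{R}^2$, i.e.,

$S(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = \begin{vmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{vmatrix}$, where $\boldsymbol{\xi}_1 = \xi_{11}\mathbf{e}_1 + \xi_{12}\mathbf{e}_2$, $\boldsymbol{\xi}_2 = \xi_{21}\mathbf{e}_1 + \xi_{22}\mathbf{e}_2$,

with $\mathbf{e}_1$, $\mathbf{e}_2$ a basis giving the orientation on $\mathbb{R}^2$.

It is easy to see that $S(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2)$ is a 2-form (Figure 136).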


Example 2. Let $\mathbf{v}$ be a uniform velocity vector field for a fluid in three-dimensional oriented
euclidean space (Figure 137). Then the flux of the fluid over the area of the parallelogram
$\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ is a bilinear skew symmetric function of $\boldsymbol{\xi}_1$ and $\boldsymbol{\xi}_2$, i.e., a 2-form defined by the triple scalar
product

$\omega^2(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = (\mathbf{v}, \boldsymbol{\xi}_1, \boldsymbol{\xi}_2)$.

Figure 137 Flux of a fluid through a surface is a 2-form.

Example 3. The oriented area of the projection of the parallelogram with sides $\boldsymbol{\xi}_1$ and $\boldsymbol{\xi}_2$ on
the $x_1, x_2$-plane in euclidean $\mathbb{R}^3$ is a 2-form.

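Problem 1. Show that for every 2-form $\omega^2$ on $\mathbb{R}^n$ we have

$\omega^2(\boldsymbol{\xi}, \boldsymbol{\xi}) = 0, \qquad \forall \boldsymbol{\xi} \in \mathbb{R}^n$.

Solution. By skew symmetry, $\omega^2(\boldsymbol{\xi}, \boldsymbol{\xi}) = -\omega^2(\boldsymbol{\xi}, \boldsymbol{\xi})$.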


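The set of all 2-forms on $\mathbb{R}^n$ becomes a real vector space if we define the
addition of forms by the formula

$(\omega_1 + \omega_2)(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = \omega_1(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) + \omega_2(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2)$

and multiplication by scalars by the formula

$(\lambda\omega)(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = \lambda\,\omega(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2)$.

Problem 2. Show that this space is finite-dimensional, and find its dimension.

Answer. $n(n - 1)/2$: a basis is shown below.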


C  k-forms 

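Definition. An exterior form of degree k, or a k-form, is a function of k vectors
which is k-linear and antisymmetric:

$\omega(\lambda_1 \boldsymbol{\xi}_1' + \lambda_2 \boldsymbol{\xi}_1'', \boldsymbol{\xi}_2, \ldots, \boldsymbol{\xi}_k) = \lambda_1 \omega(\boldsymbol{\xi}_1', \boldsymbol{\xi}_2, \ldots, \boldsymbol{\xi}_k) + \lambda_2 \omega(\boldsymbol{\xi}_1'', \boldsymbol{\xi}_2, \ldots, \boldsymbol{\xi}_k)$,

$\omega(\boldsymbol{\xi}_{i_1}, \ldots, \boldsymbol{\xi}_{i_k}) = (-1)^\nu \omega(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_k)$,

where

$\nu = 0$ if the permutation $i_1, \ldots, i_k$ is even;
$\nu = 1$ if the permutation $i_1, \ldots, i_k$ is odd.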
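Figure 138 Oriented volume is a 3-form.

Example 1. The oriented volume of the parallelepiped with edges $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_n$ in oriented euclidean
space $\mathbb{R}^n$ is an n-form (Figure 138):

$V(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_n) = \begin{vmatrix} \xi_{11} & \cdots & \xi_{1n} \\ \vdots & & \vdots \\ \xi_{n1} & \cdots & \xi_{nn} \end{vmatrix}$,

where $\boldsymbol{\xi}_i = \xi_{i1}\mathbf{e}_1 + \cdots + \xi_{in}\mathbf{e}_n$ and $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are a basis of $\mathbb{R}^n$.

Example 2. Let $\mathbb{R}^k$ be an oriented k-plane in n-dimensional euclidean space $\mathbb{R}^n$. Then the
k-dimensional oriented volume of the projection of the parallelepiped with edges
$\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \ldots, \boldsymbol{\xi}_k \in \mathbb{R}^n$ onto $\mathbb{R}^k$ is a k-form on $\mathbb{R}^n$.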

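The set of all k-forms in $\mathbb{R}^n$ forms a real vector space if we introduce
operations of addition

$(\omega_1 + \omega_2)(\boldsymbol{\xi}) = \omega_1(\boldsymbol{\xi}) + \omega_2(\boldsymbol{\xi}), \qquad \boldsymbol{\xi} = \{\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_k\}, \quad \boldsymbol{\xi}_j \in \mathbb{R}^n$,

and multiplication by scalars

$(\lambda\omega)(\boldsymbol{\xi}) = \lambda\,\omega(\boldsymbol{\xi})$.

Problem 3. Show that this vector space is finite-dimensional and find its dimension.

Answer. $C_n^k$; a basis is shown below.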

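D The exterior product of two 1-forms

We now introduce one more operation: exterior multiplication of forms.
If $\omega^k$ is a k-form and $\omega^l$ is an l-form on $\mathbb{R}^n$, then their exterior product $\omega^k \wedge \omega^l$
will be a k + l-form. We first define the exterior product of 1-forms, which
associates to every pair of 1-forms $\omega_1$, $\omega_2$ on $\mathbb{R}^n$ a 2-form $\omega_1 \wedge \omega_2$ on $\mathbb{R}^n$.

Let $\boldsymbol{\xi}$ be a vector in $\mathbb{R}^n$. Given two 1-forms $\omega_1$ and $\omega_2$, we can define a
mapping of $\mathbb{R}^n$ to the plane $\mathbb{R} \times \mathbb{R}$ by associating to $\boldsymbol{\xi} \in \mathbb{R}^n$ the vector $\boldsymbol{\omega}(\boldsymbol{\xi})$
with components $\omega_1(\boldsymbol{\xi})$ and $\omega_2(\boldsymbol{\xi})$ in the plane with coordinates $\omega_1$, $\omega_2$
(Figure 139).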


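Definition. The value of the exterior product $\omega_1 \wedge \omega_2$ on the pair of vectors
$\boldsymbol{\xi}_1, \boldsymbol{\xi}_2 \in \mathbb{R}^n$ is the oriented area of the image of the parallelogram with sides
$\boldsymbol{\omega}(\boldsymbol{\xi}_1)$ and $\boldsymbol{\omega}(\boldsymbol{\xi}_2)$ on the $\omega_1, \omega_2$-plane:

$(\omega_1 \wedge \omega_2)(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2) = \begin{vmatrix} \omega_1(\boldsymbol{\xi}_1) & \omega_2(\boldsymbol{\xi}_1) \\ \omega_1(\boldsymbol{\xi}_2) & \omega_2(\boldsymbol{\xi}_2) \end{vmatrix}$.

Figure 139 Definition of the exterior product of two 1-forms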
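The determinant in this definition translates directly into code. In the sketch below 1-forms are represented as Python functions of a vector (a tuple of components); the helper names `coordinate` and `wedge` are hypothetical.

```python
def coordinate(i):
    """The basic 1-form x_i (0-based index): picks out the i-th component."""
    return lambda xi: xi[i]

def wedge(w1, w2):
    """Exterior product of two 1-forms: the 2x2 determinant of the values w_j(xi_i)."""
    def form(xi1, xi2):
        return w1(xi1) * w2(xi2) - w2(xi1) * w1(xi2)
    return form

x1, x2 = coordinate(0), coordinate(1)
area_12 = wedge(x1, x2)
# Oriented area of the projection of a parallelogram onto the (x_1, x_2)-plane:
value = area_12((1.0, 2.0, 7.0), (3.0, 4.0, -5.0))
```

For the sample vectors the value is 1*4 - 2*3 = -2; the third components do not enter, since the form only sees the projection.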


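Problem 4. Show that $\omega_1 \wedge \omega_2$ really is a 2-form.

Problem 5. Show that the mapping

$(\omega_1, \omega_2) \to \omega_1 \wedge \omega_2$

is bilinear and skew symmetric:

$\omega_1 \wedge \omega_2 = -\omega_2 \wedge \omega_1$,

$(\lambda' \omega_1' + \lambda'' \omega_1'') \wedge \omega_2 = \lambda' \omega_1' \wedge \omega_2 + \lambda'' \omega_1'' \wedge \omega_2$.

Hint. The determinant is bilinear and skew-symmetric not only with respect to rows, but
also with respect to columns.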

Now suppose we have chosen a system of linear coordinates on $\mathbb{R}^n$, i.e., we
are given n independent 1-forms $x_1, \ldots, x_n$. We will call these forms basic.

The exterior products of the basic forms are the 2-forms $x_i \wedge x_j$. By skew-symmetry,
$x_i \wedge x_i = 0$ and $x_i \wedge x_j = -x_j \wedge x_i$. The geometric meaning of
the form $x_i \wedge x_j$ is very simple: its value on the pair of vectors $\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ is equal
to the oriented area of the image of the parallelogram $\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ on the coordinate
plane $x_i$, $x_j$ under the projection parallel to the remaining coordinate
directions.

Problem 6. Show that the $C_n^2 = n(n - 1)/2$ forms $x_i \wedge x_j$ ($i < j$) are linearly independent.

In particular, in three-dimensional euclidean space $(x_1, x_2, x_3)$, the area
of the projection on the $(x_1, x_2)$-plane is $x_1 \wedge x_2$, on the $(x_2, x_3)$-plane it is
$x_2 \wedge x_3$, and on the $(x_3, x_1)$-plane it is $x_3 \wedge x_1$.


Problem 7. Show that every 2-form in the three-dimensional space $(x_1, x_2, x_3)$ is of the form

$$P\,x_2 \wedge x_3 + Q\,x_3 \wedge x_1 + R\,x_1 \wedge x_2.$$


Problem 8. Show that every 2-form on the $n$-dimensional space with coordinates $x_1, \ldots, x_n$ can be uniquely represented in the form

$$\omega^2 = \sum_{i<j} a_{ij}\, x_i \wedge x_j.$$

Hint. Let $e_i$ be the $i$-th basis vector, i.e., $x_i(e_i) = 1$, $x_j(e_i) = 0$ for $i \neq j$. Look at the value of the form $\omega^2$ on the pair $e_i, e_j$. Then

$$a_{ij} = \omega^2(e_i, e_j).$$


E  Exterior  monomials 

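Suppose that we are given $k$ 1-forms $\omega_1, \ldots, \omega_k$. We define their exterior product $\omega_1 \wedge \cdots \wedge \omega_k$.

Definition. Set

$$(\omega_1 \wedge \cdots \wedge \omega_k)(\xi_1, \ldots, \xi_k) = \begin{vmatrix} \omega_1(\xi_1) & \cdots & \omega_k(\xi_1) \\ \vdots & & \vdots \\ \omega_1(\xi_k) & \cdots & \omega_k(\xi_k) \end{vmatrix}.$$

In other words, the value of a product of 1-forms on the parallelepiped $\xi_1, \ldots, \xi_k$ is equal to the oriented volume of the image of the parallelepiped in the oriented euclidean coordinate space $\mathbb{R}^k$ under the mapping $\xi \to (\omega_1(\xi), \ldots, \omega_k(\xi))$.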


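Problem 9. Show that $\omega_1 \wedge \cdots \wedge \omega_k$ is a $k$-form.

Problem 10. Show that the operation of exterior product of 1-forms gives a multi-linear skew-symmetric mapping

$$(\omega_1, \ldots, \omega_k) \to \omega_1 \wedge \cdots \wedge \omega_k.$$

In other words,

$$(\lambda'\omega_1' + \lambda''\omega_1'') \wedge \omega_2 \wedge \cdots \wedge \omega_k = \lambda'\,\omega_1' \wedge \omega_2 \wedge \cdots \wedge \omega_k + \lambda''\,\omega_1'' \wedge \omega_2 \wedge \cdots \wedge \omega_k$$

and

$$\omega_{i_1} \wedge \cdots \wedge \omega_{i_k} = (-1)^{\nu}\, \omega_1 \wedge \cdots \wedge \omega_k,$$

where

$$\nu = \begin{cases} 0 & \text{if the permutation } i_1, \ldots, i_k \text{ is even}, \\ 1 & \text{if the permutation } i_1, \ldots, i_k \text{ is odd}. \end{cases}$$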


Now consider a coordinate system on $\mathbb{R}^n$ given by the basic forms $x_1, \ldots, x_n$. The exterior product of $k$ basic forms

$$x_{i_1} \wedge \cdots \wedge x_{i_k}, \qquad 1 \le i_m \le n,$$

is the oriented volume of the image of a $k$-parallelepiped on the $k$-plane $(x_{i_1}, \ldots, x_{i_k})$ under the projection parallel to the remaining coordinate directions.


Problem 11. Show that, if two of the indices $i_1, \ldots, i_k$ are the same, then the form $x_{i_1} \wedge \cdots \wedge x_{i_k}$ is zero.

Problem 12. Show that the forms

$$x_{i_1} \wedge \cdots \wedge x_{i_k}, \qquad \text{where } 1 \le i_1 < i_2 < \cdots < i_k \le n,$$

are linearly independent.

The number of such forms is clearly $C_n^k$. We will call them basic $k$-forms.
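The count can be checked by listing the increasing index sets directly. The following is a small sketch using Python's standard `itertools` and `math` modules; the function name `basic_k_forms` is our own and only labels the forms by their indices.

```python
from itertools import combinations
from math import comb


def basic_k_forms(n, k):
    """Index tuples (i1 < ... < ik) labelling the basic k-forms on R^n."""
    return list(combinations(range(1, n + 1), k))


# The six basic 2-forms on R^4: x1^x2, x1^x3, x1^x4, x2^x3, x2^x4, x3^x4.
labels = basic_k_forms(4, 2)
count_matches = len(labels) == comb(4, 2)
```

For $k = n$ the list has a single entry, and for $k > n$ it is empty.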

Problem 13. Show that every $k$-form on $\mathbb{R}^n$ can be uniquely represented as a linear combination of basic forms:

$$\omega^k = \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1 \ldots i_k}\, x_{i_1} \wedge \cdots \wedge x_{i_k}.$$

Hint. $a_{i_1 \ldots i_k} = \omega^k(e_{i_1}, \ldots, e_{i_k})$.

It follows as a result of this problem that the dimension of the vector space of $k$-forms on $\mathbb{R}^n$ is equal to $C_n^k$. In particular, for $k = n$, $C_n^n = 1$, from which follows

Corollary. Every $n$-form on $\mathbb{R}^n$ is either the oriented volume of a parallelepiped with some choice of unit volume, or zero:

$$\omega^n = a \cdot x_1 \wedge \cdots \wedge x_n.$$

Problem 14. Show that every $k$-form on $\mathbb{R}^n$ with $k > n$ is zero.


We now consider the product of a $k$-form $\omega^k$ and an $l$-form $\omega^l$. First, suppose that we are given two monomials

$$\omega^k = \omega_1 \wedge \cdots \wedge \omega_k \quad \text{and} \quad \omega^l = \omega_{k+1} \wedge \cdots \wedge \omega_{k+l},$$

where $\omega_1, \ldots, \omega_{k+l}$ are 1-forms. We define their product $\omega^k \wedge \omega^l$ to be the monomial

$$(\omega_1 \wedge \cdots \wedge \omega_k) \wedge (\omega_{k+1} \wedge \cdots \wedge \omega_{k+l}) = \omega_1 \wedge \cdots \wedge \omega_k \wedge \omega_{k+1} \wedge \cdots \wedge \omega_{k+l}.$$

Problem 15. Show that the product of monomials is associative:

$$(\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m)$$

and skew-commutative:

$$\omega^k \wedge \omega^l = (-1)^{kl}\, \omega^l \wedge \omega^k.$$

Hint. In order to move each of the $l$ factors of $\omega^l$ forward, we need $k$ inversions with the $k$ factors of $\omega^k$.

Remark. It is useful to remember that skew-commutativity means commutativity if at least one of the degrees $k$ and $l$ is even, and anti-commutativity only if both degrees $k$ and $l$ are odd.
33 Exterior multiplication

We  define  here  the  operation  of  exterior  multiplication  of  forms  and  show  that  it  is  skew- 
commutative,  distributive,  and  associative. 


A  Definition  of  exterior  multiplication 

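We now define the exterior multiplication of an arbitrary $k$-form $\omega^k$ by an arbitrary $l$-form $\omega^l$. The result $\omega^k \wedge \omega^l$ will be a $(k + l)$-form. The operation of multiplication turns out to be:

1. skew-commutative: $\omega^k \wedge \omega^l = (-1)^{kl}\, \omega^l \wedge \omega^k$;

2. distributive: $(\lambda_1\omega_1^k + \lambda_2\omega_2^k) \wedge \omega^l = \lambda_1\,\omega_1^k \wedge \omega^l + \lambda_2\,\omega_2^k \wedge \omega^l$;

3. associative: $(\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m)$.

Definition. The exterior product $\omega^k \wedge \omega^l$ of a $k$-form $\omega^k$ on $\mathbb{R}^n$ with an $l$-form $\omega^l$ on $\mathbb{R}^n$ is the $(k + l)$-form on $\mathbb{R}^n$ whose value on the $k + l$ vectors $\xi_1, \ldots, \xi_k, \xi_{k+1}, \ldots, \xi_{k+l} \in \mathbb{R}^n$ is equal to

$$(\omega^k \wedge \omega^l)(\xi_1, \ldots, \xi_{k+l}) = \sum (-1)^{\nu}\, \omega^k(\xi_{i_1}, \ldots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \ldots, \xi_{j_l}), \tag{1}$$

where $i_1 < \cdots < i_k$ and $j_1 < \cdots < j_l$; $(i_1, \ldots, i_k, j_1, \ldots, j_l)$ is a permutation of the numbers $(1, 2, \ldots, k + l)$; and

$$\nu = \begin{cases} 1 & \text{if this permutation is odd}; \\ 0 & \text{if this permutation is even}. \end{cases}$$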


In other words, every partition of the $k + l$ vectors $\xi_1, \ldots, \xi_{k+l}$ into two groups (of $k$ and of $l$ vectors) gives one term in our sum (1). This term is equal to the product of the value of the $k$-form $\omega^k$ on the $k$ vectors of the first group with the value of the $l$-form $\omega^l$ on the $l$ vectors of the second group, with sign $+$ or $-$ depending on how the vectors are ordered in the groups. If they are ordered in such a way that the $k$ vectors of the first group and the $l$ vectors of the second group written in succession form an even permutation of the vectors $\xi_1, \xi_2, \ldots, \xi_{k+l}$, then we take the sign to be $+$, and if they form an odd permutation we take the sign to be $-$.

Example. If $k = l = 1$, then there are just two partitions: $\xi_1, \xi_2$ and $\xi_2, \xi_1$. Therefore,

$$(\omega_1 \wedge \omega_2)(\xi_1, \xi_2) = \omega_1(\xi_1)\,\omega_2(\xi_2) - \omega_2(\xi_1)\,\omega_1(\xi_2),$$

which agrees with the definition of multiplication of 1-forms in Section 32.
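Formula (1) can be transcribed almost literally into code. The following is a minimal sketch in Python, assuming that a $k$-form is represented as an ordinary function of $k$ vectors; the names `sign` and `wedge`, and the 2-form $x_1 \wedge x_3 + x_2 \wedge x_4$ on $\mathbb{R}^4$ used as test data, are illustrative choices of ours.

```python
from itertools import combinations


def sign(perm):
    """+1 for an even permutation of distinct integers, -1 for an odd one."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def wedge(wk, k, wl, l):
    """Exterior product of a k-form wk and an l-form wl by formula (1)."""
    def form(*xs):
        total = 0
        # Each choice of k increasing indices is one partition into two groups.
        for first in combinations(range(k + l), k):
            second = tuple(j for j in range(k + l) if j not in first)
            term = wk(*(xs[i] for i in first)) * wl(*(xs[j] for j in second))
            total += sign(first + second) * term
        return total
    return form


def omega(u, v):
    """Illustrative 2-form x1^x3 + x2^x4 on R^4 (an assumption for the example)."""
    return (u[0] * v[2] - u[2] * v[0]) + (u[1] * v[3] - u[3] * v[1])


basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
square_on_basis = wedge(omega, 2, omega, 2)(*basis)
```

For $k = l = 1$ the sum has exactly the two terms of the example above. The cost grows like $C_{k+l}^k$ evaluations of each form, so this direct transcription is meant for checking small cases, not for efficiency.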

Problem 1. Show that the definition above actually defines a $(k + l)$-form (i.e., that the value of $(\omega^k \wedge \omega^l)(\xi_1, \ldots, \xi_{k+l})$ depends linearly and skew-symmetrically on the vectors $\xi$).
B Properties of the exterior product

Theorem. The exterior multiplication of forms defined above is skew-commutative, distributive, and associative. For monomials it coincides with the multiplication defined in Section 32.


The  proof  of  skew-commutativity  is  based  on  the  simplest  properties  of 
even  and  odd  permutations  (cf.  the  problem  at  the  end  of  Section  32)  and  will 
be  left  to  the  reader. 

Distributivity follows from the fact that every term in (1) is linear with respect to $\omega^k$ and $\omega^l$.

The  proof  of  associativity  requires  a  little  more  combinatorics.  Since  the 
corresponding  arguments  are  customarily  carried  out  in  algebra  courses  for 
the  proof  of  Laplace’s  theorem  on  the  expansion  of  a  determinant  by  column 
minors, we may use this theorem.⁵³

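We begin with the following observation: if associativity is proved for the terms of a sum, then it is also true for the sum, i.e.,

$$(\omega_1' \wedge \omega_2) \wedge \omega_3 = \omega_1' \wedge (\omega_2 \wedge \omega_3),$$

$$(\omega_1'' \wedge \omega_2) \wedge \omega_3 = \omega_1'' \wedge (\omega_2 \wedge \omega_3)$$

implies

$$((\omega_1' + \omega_1'') \wedge \omega_2) \wedge \omega_3 = (\omega_1' + \omega_1'') \wedge (\omega_2 \wedge \omega_3).$$

For, by distributivity, which has already been proved, we have

$$((\omega_1' + \omega_1'') \wedge \omega_2) \wedge \omega_3 = ((\omega_1' \wedge \omega_2) \wedge \omega_3) + ((\omega_1'' \wedge \omega_2) \wedge \omega_3),$$

$$(\omega_1' + \omega_1'') \wedge (\omega_2 \wedge \omega_3) = (\omega_1' \wedge (\omega_2 \wedge \omega_3)) + (\omega_1'' \wedge (\omega_2 \wedge \omega_3)).$$

We already know from Section 32 (Problem 13) that every form on $\mathbb{R}^n$ is a sum of monomials; therefore, it is enough to show associativity for multiplication of monomials.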

Since we have not yet proved the equivalence of the definition in Section 32 of multiplication of $k$ 1-forms with the general definition (1), we will temporarily denote the multiplication of $k$ 1-forms by the symbol $\bar{\wedge}$, so that our monomials have the form

$$\omega^k = \omega_1 \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_k \quad \text{and} \quad \omega^l = \omega_{k+1} \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_{k+l},$$

where $\omega_1, \ldots, \omega_{k+l}$ are 1-forms.


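⁵³ A direct proof of associativity (also containing a proof of Laplace's theorem) consists of checking the signs in the identity

$$((\omega^k \wedge \omega^l) \wedge \omega^m)(\xi_1, \ldots, \xi_{k+l+m}) = \sum \pm\, \omega^k(\xi_{i_1}, \ldots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \ldots, \xi_{j_l})\, \omega^m(\xi_{h_1}, \ldots, \xi_{h_m}),$$

where $i_1 < \cdots < i_k$, $j_1 < \cdots < j_l$, $h_1 < \cdots < h_m$; $(i_1, \ldots, h_m)$ is a permutation of the numbers $(1, \ldots, k + l + m)$.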
Lemma. The exterior product of two monomials is a monomial:

$$(\omega_1 \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_k) \wedge (\omega_{k+1} \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_{k+l}) = \omega_1 \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_k \mathbin{\bar{\wedge}} \omega_{k+1} \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_{k+l}.$$

Proof. We calculate the values of the left and right sides on $k + l$ vectors $\xi_1, \ldots, \xi_{k+l}$. The value of the left side, by formula (1), is equal to the sum of the products

$$\sum \pm \det_{1 \le i \le k} \bigl|\omega_i(\xi_{j_m})\bigr| \cdot \det_{k < i \le k+l} \bigl|\omega_i(\xi_{j_m})\bigr|$$

of the minors of the first $k$ columns of the determinant of order $k + l$ and the remaining minors. Laplace's theorem on the expansion by minors of the first $k$ columns asserts exactly that this sum, with the same rule of sign choice as in Definition (1), is equal to the determinant $\det |\omega_i(\xi_j)|$. □

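It follows from the lemma that the operations $\bar{\wedge}$ and $\wedge$ coincide: we get, in turn,

$$\omega_1 \mathbin{\bar{\wedge}} \omega_2 = \omega_1 \wedge \omega_2,$$

$$\omega_1 \mathbin{\bar{\wedge}} \omega_2 \mathbin{\bar{\wedge}} \omega_3 = (\omega_1 \mathbin{\bar{\wedge}} \omega_2) \wedge \omega_3 = (\omega_1 \wedge \omega_2) \wedge \omega_3,$$

$$\omega_1 \mathbin{\bar{\wedge}} \omega_2 \mathbin{\bar{\wedge}} \cdots \mathbin{\bar{\wedge}} \omega_k = (\cdots((\omega_1 \wedge \omega_2) \wedge \omega_3) \wedge \cdots \wedge \omega_k).$$

The associativity of $\wedge$-multiplication of monomials therefore follows from the obvious associativity of $\bar{\wedge}$-multiplication of 1-forms. Thus, in view of the observation made above, associativity is proved in the general case.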


Problem 2. Show that the exterior square of a 1-form, or, in general, of a form of odd order, is equal to zero: $\omega^k \wedge \omega^k = 0$ if $k$ is odd.


Example 1. Consider a coordinate system $p_1, \ldots, p_n, q_1, \ldots, q_n$ on $\mathbb{R}^{2n}$ and the 2-form

$$\omega^2 = \sum_{i=1}^{n} p_i \wedge q_i.$$

[Geometrically, this form signifies the sum of the oriented areas of the projection of a parallelogram on the $n$ two-dimensional coordinate planes $(p_1, q_1), \ldots, (p_n, q_n)$. Later, we will see that the 2-form $\omega^2$ has a special meaning for hamiltonian mechanics. It can be shown that every nondegenerate⁵⁴ 2-form on $\mathbb{R}^{2n}$ has the form $\omega^2$ in some coordinate system $(p_1, \ldots, q_n)$.]


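Problem 3. Find the exterior square of the 2-form $\omega^2$.

Answer.

$$\omega^2 \wedge \omega^2 = -2 \sum_{i > j} p_i \wedge p_j \wedge q_i \wedge q_j.$$

Problem 4. Find the exterior $k$-th power of $\omega^2$.

Answer.

$$\underbrace{\omega^2 \wedge \omega^2 \wedge \cdots \wedge \omega^2}_{k} = \pm k! \sum_{i_1 < \cdots < i_k} p_{i_1} \wedge \cdots \wedge p_{i_k} \wedge q_{i_1} \wedge \cdots \wedge q_{i_k}.$$

In particular,

$$\underbrace{\omega^2 \wedge \cdots \wedge \omega^2}_{n} = \pm n!\; p_1 \wedge \cdots \wedge p_n \wedge q_1 \wedge \cdots \wedge q_n$$

is, up to a factor, the volume of a $2n$-dimensional parallelepiped in $\mathbb{R}^{2n}$.

⁵⁴ A bilinear form $\omega^2$ is nondegenerate if $\forall \xi \neq 0$, $\exists \eta$: $\omega^2(\xi, \eta) \neq 0$. See Section 41B.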


Example 2. Consider the oriented euclidean space $\mathbb{R}^3$. Every vector $A \in \mathbb{R}^3$ determines a 1-form $\omega_A^1$ by $\omega_A^1(\xi) = (A, \xi)$ (scalar product) and a 2-form $\omega_A^2$ by

$$\omega_A^2(\xi_1, \xi_2) = (A, \xi_1, \xi_2) \quad \text{(triple scalar product)}.$$

Problem 5. Show that the maps $A \to \omega_A^1$ and $A \to \omega_A^2$ establish isomorphisms of the linear space $\mathbb{R}^3$ of vectors $A$ with the linear spaces of 1-forms on $\mathbb{R}^3$ and 2-forms on $\mathbb{R}^3$. If we choose an orthonormal oriented coordinate system $(x_1, x_2, x_3)$ on $\mathbb{R}^3$, then

$$\omega_A^1 = A_1 x_1 + A_2 x_2 + A_3 x_3$$

and

$$\omega_A^2 = A_1\, x_2 \wedge x_3 + A_2\, x_3 \wedge x_1 + A_3\, x_1 \wedge x_2.$$

Remark. Thus the isomorphisms do not depend on the choice of the orthonormal oriented coordinate system $(x_1, x_2, x_3)$. But they do depend on the choice of the euclidean structure on $\mathbb{R}^3$, and the isomorphism $A \to \omega_A^2$ also depends on the orientation (coming implicitly in the definition of triple scalar product).

Problem 6. Show that, under the isomorphisms established above, the exterior product of 1-forms becomes the vector product in $\mathbb{R}^3$, i.e., that

$$\omega_A^1 \wedge \omega_B^1 = \omega_{[A, B]}^2 \quad \text{for any } A, B \in \mathbb{R}^3.$$

In this way the exterior product of 1-forms can be considered as an extension of the vector product in $\mathbb{R}^3$ to higher dimensions. However, in the $n$-dimensional case, the product is not a vector in the same space: the space of 2-forms on $\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$ only for $n = 3$.
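The statement that the exterior product of 1-forms corresponds to the vector product in $\mathbb{R}^3$ can be spot-checked numerically. The sketch below assumes an orthonormal oriented coordinate system, so that scalar, vector, and triple products are given by the usual component formulas; the helper names and the sample vectors are illustrative, and a check on sample data is of course not a proof.

```python
def dot(a, b):
    """Scalar product (a, b) in orthonormal coordinates."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product [a, b] in oriented orthonormal coordinates."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def form1(A):
    """The 1-form xi -> (A, xi)."""
    return lambda xi: dot(A, xi)


def form2(A):
    """The 2-form (xi, eta) -> (A, xi, eta), the triple scalar product."""
    return lambda xi, eta: dot(A, cross(xi, eta))


def wedge2(w1, w2):
    """Exterior product of two 1-forms."""
    return lambda xi, eta: w1(xi) * w2(eta) - w2(xi) * w1(eta)


# Sample vectors, chosen only for illustration.
A, B = (1, 2, 3), (-1, 0, 4)
xi, eta = (2, 1, 0), (0, 3, 5)

lhs = wedge2(form1(A), form1(B))(xi, eta)   # value of the product of the 1-forms
rhs = form2(cross(A, B))(xi, eta)           # value of the 2-form of [A, B]
```

Both sides evaluate to the same number, in agreement with the identity $(A, \xi)(B, \eta) - (B, \xi)(A, \eta) = ([A, B], [\xi, \eta])$.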

Problem 7. Show that, under the isomorphisms established above, the exterior product of a 1-form and a 2-form becomes the scalar product of vectors in $\mathbb{R}^3$:

$$\omega_A^1 \wedge \omega_B^2 = (A, B)\, x_1 \wedge x_2 \wedge x_3.$$

C  Behavior  under  mappings 

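Let $f: \mathbb{R}^m \to \mathbb{R}^n$ be a linear map, and $\omega^k$ an exterior $k$-form on $\mathbb{R}^n$. Then there is a $k$-form $f^*\omega^k$ on $\mathbb{R}^m$, whose value on the $k$ vectors $\xi_1, \ldots, \xi_k \in \mathbb{R}^m$ is equal to the value of $\omega^k$ on their images:

$$(f^*\omega^k)(\xi_1, \ldots, \xi_k) = \omega^k(f\xi_1, \ldots, f\xi_k).$$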
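The induced form is obtained simply by composing with the map. The following is a minimal sketch in Python, assuming linear maps are given by matrices stored as lists of rows and forms are functions of vectors; the names `apply_map`, `pullback`, and `compose` and the two sample matrices are our own illustrative choices.

```python
def apply_map(matrix, v):
    """Apply the linear map with the given rows to the vector v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in matrix)


def pullback(matrix, form):
    """Return f^* form: evaluate `form` on the images of its arguments."""
    def pulled(*vectors):
        return form(*(apply_map(matrix, v) for v in vectors))
    return pulled


def compose(g, f):
    """Matrix of the composition g o f."""
    columns = list(zip(*f))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in g]


def area(u, v):
    """The 2-form x1 ^ x2 on the plane."""
    return u[0] * v[1] - u[1] * v[0]


# Illustrative maps f: R^2 -> R^3 and g: R^3 -> R^2.
f = [[1, 2], [0, 1], [3, 0]]
g = [[1, 0, 1], [2, 1, 0]]
xi, eta = (1, 0), (0, 1)

via_composition = pullback(compose(g, f), area)(xi, eta)    # (g o f)^* area
via_two_steps = pullback(f, pullback(g, area))(xi, eta)     # f^* (g^* area)
```

The two values coincide, which illustrates on one example that the star operation reverses the order of composition.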

Problem 8. Verify that $f^*\omega^k$ is an exterior form.

Problem 9. Verify that $f^*$ is a linear operator from the space of $k$-forms on $\mathbb{R}^n$ to the space of $k$-forms on $\mathbb{R}^m$ (the star superscript means that $f^*$ acts in the opposite direction from $f$).

Problem 10. Let $f: \mathbb{R}^m \to \mathbb{R}^n$ and $g: \mathbb{R}^n \to \mathbb{R}^p$. Verify that $(g \circ f)^* = f^* \circ g^*$.

Problem 11. Verify that $f^*$ preserves exterior multiplication: $f^*(\omega^k \wedge \omega^l) = (f^*\omega^k) \wedge (f^*\omega^l)$.
34 Differential forms

We  give  here  the  definition  of  differential  forms  on  differentiable  manifolds. 

A  Differential  1-forms 

The  simplest  example  of  a  differential  form  is  the  differential  of  a  function. 

Example. Consider the function $y = f(x) = x^2$. Its differential $df = 2x\,dx$ depends on the point $x$ and on the "increment of the argument," i.e., on the tangent vector $\xi$ to the $x$ axis. We fix the point $x$. Then the differential of the function at $x$, $df|_x$, depends linearly on $\xi$. So, if $x = 1$ and the coordinate of the tangent vector $\xi$ is equal to 1, then $df = 2$, and if the coordinate of $\xi$ is equal to 10, then $df = 20$ (Figure 140).


Figure  140  Differential  of  a  function 

Let $f: M \to \mathbb{R}$ be a differentiable function on the manifold $M$ (we can imagine a "function of many variables" $f: \mathbb{R}^n \to \mathbb{R}$). The differential $df|_x$ of $f$ at $x$ is a linear map

$$df_x: TM_x \to \mathbb{R}$$

of the tangent space to $M$ at $x$ into the real line. We recall from Section 18F the definition of this map:

Let $\xi \in TM_x$ be the velocity vector of the curve $x(t): \mathbb{R} \to M$; $x(0) = x$ and $\dot{x}(0) = \xi$. Then, by definition,

$$df_x(\xi) = \left.\frac{d}{dt}\right|_{t=0} f(x(t)).$$


Problem 1. Let $\xi$ be the velocity vector of the plane curve $x(t) = \cos t$, $y(t) = \sin t$ at $t = 0$. Calculate the values of the differentials $dx$ and $dy$ of the functions $x$ and $y$ on the vector $\xi$ (Figure 141).

Answer. $dx|_{(1,0)}(\xi) = 0$, $dy|_{(1,0)}(\xi) = 1$.
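The definition of the differential as a derivative along a curve can be imitated numerically. The sketch below is only an approximation, assuming a central difference with a small step `h` in place of the exact derivative at $t = 0$; the function name `differential` is ours.

```python
import math


def differential(f, curve, h=1e-6):
    """Approximate d/dt f(curve(t)) at t = 0 by a central difference."""
    return (f(curve(h)) - f(curve(-h))) / (2 * h)


def circle(t):
    """The plane curve x(t) = cos t, y(t) = sin t."""
    return (math.cos(t), math.sin(t))


# Values of dx and dy on the velocity vector of the circle at t = 0.
dx_value = differential(lambda point: point[0], circle)
dy_value = differential(lambda point: point[1], circle)
```

Up to rounding and truncation error, the computed values agree with the answer $dx(\xi) = 0$, $dy(\xi) = 1$.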

Note that the differential of a function $f$ at a point $x \in M$ is a 1-form $df_x$ on the tangent space $TM_x$.

The differential $df$ of $f$ on the manifold $M$ is a smooth map of the tangent bundle $TM$ to the line

$$df: TM \to \mathbb{R} \qquad \Bigl(TM = \bigcup_x TM_x\Bigr).$$

This map is differentiable and is linear on each tangent space $TM_x \subset TM$.

Definition. A differential form of degree 1 (or a 1-form) on a manifold $M$ is a smooth map

$$\omega: TM \to \mathbb{R}$$

of the tangent bundle of $M$ to the line, linear on each tangent space $TM_x$.

One could say that a differential 1-form on $M$ is an algebraic 1-form on $TM_x$ which is "differentiable with respect to $x$."


Problem  2.  Show  that  every  differential  1-form  on  the  line  is  the  differential  of  some  function. 

Problem  3.  Find  differential  1-forms  on  the  circle  and  the  plane  which  are  not  the  differential 
of  any  function. 


B The general form of a differential 1-form on $\mathbb{R}^n$

We take as our manifold $M$ a vector space with coordinates $x_1, \ldots, x_n$. Recall that the components $\xi_1, \ldots, \xi_n$ of a tangent vector $\xi \in T\mathbb{R}^n_x$ are the values of the differentials $dx_1, \ldots, dx_n$ on the vector $\xi$. These $n$ 1-forms on $T\mathbb{R}^n_x$ are linearly independent. Thus the 1-forms $dx_1, \ldots, dx_n$ form a basis for the $n$-dimensional space of 1-forms on $T\mathbb{R}^n_x$, and every 1-form on $T\mathbb{R}^n_x$ can be uniquely written in the form $a_1\,dx_1 + \cdots + a_n\,dx_n$, where the $a_i$ are real coefficients. Now let $\omega$ be an arbitrary differential 1-form on $\mathbb{R}^n$. At every point $x$ it can be expanded uniquely in the basis $dx_1, \ldots, dx_n$. From this we get:

Theorem. Every differential 1-form on the space $\mathbb{R}^n$ with a given coordinate system $x_1, \ldots, x_n$ can be written uniquely in the form

$$\omega = a_1(x)\,dx_1 + \cdots + a_n(x)\,dx_n,$$

where the coefficients $a_i(x)$ are smooth functions.


Figure  142  Problem  4 


Problem 4. Calculate the value of the forms $\omega_1 = dx_1$, $\omega_2 = x_1\,dx_2$, and $\omega_3 = dr^2$ ($r^2 = x_1^2 + x_2^2$) on the vectors $\xi_1$, $\xi_2$, and $\xi_3$ (Figure 142).

Answer.

$$\begin{array}{c|rrr}
 & \xi_1 & \xi_2 & \xi_3 \\ \hline
\omega_1 & 0 & -1 & 1 \\
\omega_2 & 0 & -2 & -2 \\
\omega_3 & 0 & -8 & 0
\end{array}$$

Problem 5. Let $x_1, \ldots, x_n$ be functions on a manifold $M$ forming a local coordinate system in some region. Show that every 1-form on this region can be uniquely written in the form $\omega = a_1(x)\,dx_1 + \cdots + a_n(x)\,dx_n$.


C  Differential  k-forms 

Definition. A differential $k$-form $\omega^k|_x$ at a point $x$ of a manifold $M$ is an exterior $k$-form on the tangent space $TM_x$ to $M$ at $x$, i.e., a $k$-linear skew-symmetric function of $k$ vectors tangent to $M$ at $x$.

If such a form $\omega^k|_x$ is given at every point $x$ of the manifold $M$ and if it is differentiable, then we say that we are given a $k$-form $\omega^k$ on the manifold $M$.

Problem 6. Put a natural differentiable manifold structure on the set whose elements are $k$-tuples of vectors tangent to $M$ at some point $x$.

A differential $k$-form is a smooth map from the manifold of Problem 6 to the line.

Problem 7. Show that the $k$-forms on $M$ form a vector space (infinite-dimensional if $k$ does not exceed the dimension of $M$).

Differential forms can be multiplied by functions as well as by numbers. Therefore, the set of $C^\infty$ differential $k$-forms has a natural structure as a module over the ring of infinitely differentiable real functions on $M$.
D The general form of a differential k-form on $\mathbb{R}^n$

Take as the manifold $M$ the vector space $\mathbb{R}^n$ with fixed coordinate functions $x_1, \ldots, x_n: \mathbb{R}^n \to \mathbb{R}$. Fix a point $x$. We saw above that the $n$ 1-forms $dx_1, \ldots, dx_n$ form a basis of the space of 1-forms on the tangent space $T\mathbb{R}^n_x$.

Consider exterior products of the basic forms:

$$dx_{i_1} \wedge \cdots \wedge dx_{i_k}, \qquad i_1 < \cdots < i_k.$$

In Section 32 we saw that these $C_n^k$ $k$-forms form a basis of the space of exterior $k$-forms on $T\mathbb{R}^n_x$. Therefore, every exterior $k$-form on $T\mathbb{R}^n_x$ can be written uniquely in the form

$$\sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$

Now let $\omega$ be an arbitrary differential $k$-form on $\mathbb{R}^n$. At every point $x$ it can be uniquely expressed in terms of the basis above. From this follows:

Theorem. Every differential $k$-form on the space $\mathbb{R}^n$ with a given coordinate system $x_1, \ldots, x_n$ can be written uniquely in the form

$$\omega^k = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(x)\, dx_{i_1} \wedge \cdots \wedge dx_{i_k},$$

where the $a_{i_1 \ldots i_k}(x)$ are smooth functions on $\mathbb{R}^n$.

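Problem 8. Calculate the value of the forms $\omega_1 = dx_1 \wedge dx_2$, $\omega_2 = x_1\,dx_1 \wedge dx_2 - x_2\,dx_2 \wedge dx_1$, and $\omega_3 = r\,dr \wedge d\varphi$ (where $x_1 = r\cos\varphi$ and $x_2 = r\sin\varphi$) on the pairs of vectors $(\xi_1, \eta_1)$, $(\xi_2, \eta_2)$, and $(\xi_3, \eta_3)$ (Figure 143).

Answer.

$$\begin{array}{c|rrr}
 & (\xi_1, \eta_1) & (\xi_2, \eta_2) & (\xi_3, \eta_3) \\ \hline
\omega_1 & 1 & 1 & -1 \\
\omega_2 & 2 & 1 & -3 \\
\omega_3 & 1 & 1 & -1
\end{array}$$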

Figure  143  Problem  8 
Problem 9. Calculate the value of the forms $\omega_1 = dx_2 \wedge dx_3$, $\omega_2 = x_1\,dx_3 \wedge dx_2$, and $\omega_3 = dx_3 \wedge dr^2$ ($r^2 = x_1^2 + x_2^2 + x_3^2$), on the pair of vectors $\xi = (1, 1, 1)$, $\eta = (1, 2, 3)$ at the point $x = (2, 0, 0)$.

Answer. $\omega_1 = 1$, $\omega_2 = -2$, $\omega_3 = -8$.
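These three values can be reproduced mechanically. The sketch below, in Python, assumes that a 1-form at the fixed point $x = (2, 0, 0)$ is stored by its coefficients with respect to $dx_1, dx_2, dx_3$, takes $\xi = (1, 1, 1)$ and $\eta = (1, 2, 3)$, and uses $dr^2 = 2x_1\,dx_1 + 2x_2\,dx_2 + 2x_3\,dx_3$; the helper names are illustrative.

```python
def dot(a, b):
    """Value of the 1-form with coefficients a on the vector b."""
    return sum(p * q for p, q in zip(a, b))


def wedge2(a, b):
    """The 2-form a ^ b for 1-forms a, b given by their coefficients."""
    return lambda xi, eta: dot(a, xi) * dot(b, eta) - dot(b, xi) * dot(a, eta)


x = (2, 0, 0)                       # point at which the forms are evaluated
xi, eta = (1, 1, 1), (1, 2, 3)      # the pair of tangent vectors

dx1, dx2, dx3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
dr2 = tuple(2 * c for c in x)       # coefficients of d(r^2) at the point x

value1 = wedge2(dx2, dx3)(xi, eta)
value2 = x[0] * wedge2(dx3, dx2)(xi, eta)
value3 = wedge2(dx3, dr2)(xi, eta)
```

The computed values are $1$, $-2$, and $-8$, matching the answer.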

Problem 10. Let $x_1, \ldots, x_n: M \to \mathbb{R}$ be functions on a manifold which form a local coordinate system on some region. Show that every differential form on this region can be written uniquely in the form

$$\omega^k = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(x)\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$


Example. Change of variables in a form. Suppose that we are given two coordinate systems on $\mathbb{R}^3$: $x_1, x_2, x_3$ and $y_1, y_2, y_3$. Let $\omega$ be a 2-form on $\mathbb{R}^3$. Then, by the theorem above, $\omega$ can be written in the system of $x$-coordinates as $\omega = X_1\,dx_2 \wedge dx_3 + X_2\,dx_3 \wedge dx_1 + X_3\,dx_1 \wedge dx_2$, where $X_1$, $X_2$, and $X_3$ are functions of $x_1$, $x_2$, and $x_3$, and in the system of $y$-coordinates as $\omega = Y_1\,dy_2 \wedge dy_3 + Y_2\,dy_3 \wedge dy_1 + Y_3\,dy_1 \wedge dy_2$, where $Y_1$, $Y_2$, and $Y_3$ are functions of $y_1$, $y_2$, and $y_3$.


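Problem 11. Given the form written in the $x$-coordinates (i.e., the $X_i$) and the change of variables formulas $x = x(y)$, write the form in $y$-coordinates, i.e., find $Y$.

Solution. We have $dx_i = (\partial x_i/\partial y_1)\,dy_1 + (\partial x_i/\partial y_2)\,dy_2 + (\partial x_i/\partial y_3)\,dy_3$. Therefore,

$$dx_2 \wedge dx_3 = \left(\frac{\partial x_2}{\partial y_1}\,dy_1 + \frac{\partial x_2}{\partial y_2}\,dy_2 + \frac{\partial x_2}{\partial y_3}\,dy_3\right) \wedge \left(\frac{\partial x_3}{\partial y_1}\,dy_1 + \frac{\partial x_3}{\partial y_2}\,dy_2 + \frac{\partial x_3}{\partial y_3}\,dy_3\right),$$

from which we get

$$Y_3 = X_1\,\frac{D(x_2, x_3)}{D(y_1, y_2)} + X_2\,\frac{D(x_3, x_1)}{D(y_1, y_2)} + X_3\,\frac{D(x_1, x_2)}{D(y_1, y_2)}.$$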


E  Appendix.  Differential  forms  in  three-dimensional  spaces 

Let $M$ be a three-dimensional oriented riemannian manifold (in all future examples $M$ will be euclidean three-space $\mathbb{R}^3$). Let $x_1$, $x_2$, and $x_3$ be local coordinates, and let the square of the length element have the form

$$ds^2 = E_1\,dx_1^2 + E_2\,dx_2^2 + E_3\,dx_3^2$$

(i.e., the coordinate system is triply orthogonal).


Problem 12. Find $E_1$, $E_2$, and $E_3$ for cartesian coordinates $x, y, z$, for cylindrical coordinates $r, \varphi, z$ and for spherical coordinates $R, \varphi, \theta$ in the euclidean space $\mathbb{R}^3$ (Figure 144).

Answer.

$$ds^2 = dx^2 + dy^2 + dz^2 = dr^2 + r^2\,d\varphi^2 + dz^2 = dR^2 + R^2\cos^2\theta\,d\varphi^2 + R^2\,d\theta^2.$$

We let $e_1$, $e_2$, and $e_3$ denote the unit vectors in the coordinate directions. These three vectors form a basis of the tangent space.
Problem 13. Find the values of the forms $dx_1$, $dx_2$, and $dx_3$ on the vectors $e_1$, $e_2$, and $e_3$.

Answer. $dx_i(e_i) = 1/\sqrt{E_i}$, the rest are zero. In particular, for cartesian coordinates $dx(e_x) = dy(e_y) = dz(e_z) = 1$; for cylindrical coordinates $dr(e_r) = dz(e_z) = 1$ and $d\varphi(e_\varphi) = 1/r$ (Figure 145); for spherical coordinates $dR(e_R) = 1$, $d\varphi(e_\varphi) = 1/(R\cos\theta)$ and $d\theta(e_\theta) = 1/R$.

The metric and orientation on the manifold $M$ furnish the tangent space to $M$ at every point with the structure of an oriented euclidean three-dimensional space. In terms of this structure, we can talk about scalar, vector, and triple scalar products.


Problem 14. Calculate $[e_1, e_2]$, $(e_R, e_\theta)$, and $(e_z, e_x, e_y)$.

Answer. $e_3$, $0$, $1$.


In an oriented euclidean three-space every vector $A$ corresponds to a 1-form $\omega_A^1$ and a 2-form $\omega_A^2$, defined by the conditions

$$\omega_A^1(\xi) = (A, \xi), \qquad \omega_A^2(\xi, \eta) = (A, \xi, \eta), \qquad \xi, \eta \in \mathbb{R}^3.$$

The correspondence between vector fields and forms does not depend on the system of coordinates, but only on the euclidean structure and orientation. Therefore, every vector field $A$ on our manifold $M$ corresponds to a differential 1-form $\omega_A^1$ on $M$ and a differential 2-form $\omega_A^2$ on $M$.

The formulas for changing from fields to forms and back have a different form in each coordinate system. Suppose that in the coordinates $x_1$, $x_2$, and $x_3$ described above, the vector field has the form

$$A = A_1 e_1 + A_2 e_2 + A_3 e_3$$

(the components $A_i$ are smooth functions on $M$). The corresponding 1-form $\omega_A^1$ decomposes over the basis $dx_i$, and the corresponding 2-form over the basis $dx_i \wedge dx_j$.


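Problem 15. Given the components of the vector field $A$, find the decompositions of the 1-form $\omega_A^1$ and the 2-form $\omega_A^2$.

Solution. We have $\omega_A^1(e_1) = (A, e_1) = A_1$. Also, $(a_1\,dx_1 + a_2\,dx_2 + a_3\,dx_3)(e_1) = a_1\,dx_1(e_1) = a_1/\sqrt{E_1}$. From this we get that $a_1 = A_1\sqrt{E_1}$, so that

$$\omega_A^1 = A_1\sqrt{E_1}\,dx_1 + A_2\sqrt{E_2}\,dx_2 + A_3\sqrt{E_3}\,dx_3.$$

In the same way, we have $\omega_A^2(e_2, e_3) = (A, e_2, e_3) = A_1$. Also,

$$(a_1\,dx_2 \wedge dx_3 + a_2\,dx_3 \wedge dx_1 + a_3\,dx_1 \wedge dx_2)(e_2, e_3) = a_1\,\frac{1}{\sqrt{E_2 E_3}}.$$

Hence, $a_1 = A_1\sqrt{E_2 E_3}$, i.e.,

$$\omega_A^2 = A_1\sqrt{E_2 E_3}\,dx_2 \wedge dx_3 + A_2\sqrt{E_3 E_1}\,dx_3 \wedge dx_1 + A_3\sqrt{E_1 E_2}\,dx_1 \wedge dx_2.$$

In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$ the vector field

$$A = A_x e_x + A_y e_y + A_z e_z = A_r e_r + A_\varphi e_\varphi + A_z e_z = A_R e_R + A_\varphi e_\varphi + A_\theta e_\theta$$

corresponds to the 1-form

$$\omega_A^1 = A_x\,dx + A_y\,dy + A_z\,dz = A_r\,dr + rA_\varphi\,d\varphi + A_z\,dz = A_R\,dR + R\cos\theta\,A_\varphi\,d\varphi + RA_\theta\,d\theta$$

and the 2-form

$$\begin{aligned}
\omega_A^2 &= A_x\,dy \wedge dz + A_y\,dz \wedge dx + A_z\,dx \wedge dy \\
&= rA_r\,d\varphi \wedge dz + A_\varphi\,dz \wedge dr + rA_z\,dr \wedge d\varphi \\
&= R^2\cos\theta\,A_R\,d\varphi \wedge d\theta + RA_\varphi\,d\theta \wedge dR + R\cos\theta\,A_\theta\,dR \wedge d\varphi.
\end{aligned}$$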


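An example of a vector field on a manifold $M$ is the gradient of a function $f: M \to \mathbb{R}$. Recall that the gradient of a function is the vector field $\operatorname{grad} f$ corresponding to the differential:

$$\omega^1_{\operatorname{grad} f} = df, \quad \text{i.e.,} \quad df(\xi) = (\operatorname{grad} f, \xi) \quad \forall \xi.$$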


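Problem 16. Find the components of the gradient of a function in the basis $e_1, e_2, e_3$.

Solution. We have $df = (\partial f/\partial x_1)\,dx_1 + (\partial f/\partial x_2)\,dx_2 + (\partial f/\partial x_3)\,dx_3$. By the problem above

$$\operatorname{grad} f = \frac{1}{\sqrt{E_1}}\frac{\partial f}{\partial x_1}\,e_1 + \frac{1}{\sqrt{E_2}}\frac{\partial f}{\partial x_2}\,e_2 + \frac{1}{\sqrt{E_3}}\frac{\partial f}{\partial x_3}\,e_3.$$

In particular, in cartesian, cylindrical, and spherical coordinates

$$\begin{aligned}
\operatorname{grad} f &= \frac{\partial f}{\partial x}\,e_x + \frac{\partial f}{\partial y}\,e_y + \frac{\partial f}{\partial z}\,e_z \\
&= \frac{\partial f}{\partial r}\,e_r + \frac{1}{r}\frac{\partial f}{\partial \varphi}\,e_\varphi + \frac{\partial f}{\partial z}\,e_z \\
&= \frac{\partial f}{\partial R}\,e_R + \frac{1}{R\cos\theta}\frac{\partial f}{\partial \varphi}\,e_\varphi + \frac{1}{R}\frac{\partial f}{\partial \theta}\,e_\theta.
\end{aligned}$$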


35  Integration  of  differential  forms 

We  define  here  the  concepts  of  a  chain,  the  boundary  of  a  chain,  and  the  integration  of  a  form 
over  a  chain. 

The  integral  of  a  differential  form  is  a  higher-dimensional  generalization  of  such  ideas  as  the 
flux  of  a  fluid  across  a  surface  or  the  work  of  a  force  along  a  path. 


A  The  integral  of  a  1-form  along  a  path 

We begin by integrating a 1-form $\omega^1$ on a manifold $M$. Let

$$\gamma: [0 \le t \le 1] \to M$$

be a smooth map (the "path of integration"). The integral of the form $\omega^1$ on the path $\gamma$ is defined as a limit of Riemann sums. Every Riemann sum consists of the values of the form $\omega^1$ on some tangent vectors $\xi_i$ (Figure 146):

$$\int_\gamma \omega^1 = \lim_{\Delta \to 0} \sum_{i=1}^{n} \omega^1(\xi_i).$$

The tangent vectors $\xi_i$ are constructed in the following way. The interval $0 \le t \le 1$ is divided into parts $\Delta_i$: $t_i \le t \le t_{i+1}$ by the points $t_i$. The interval $\Delta_i$ can be looked at as a tangent vector $\Delta_i$ to the $t$ axis at the point $t_i$. Its image in the tangent space to $M$ at the point $\gamma(t_i)$ is

$$\xi_i = d\gamma|_{t_i}(\Delta_i) \in TM_{\gamma(t_i)}.$$

The sum has a limit as the largest of the intervals $\Delta_i$ tends to zero. It is called the integral of the 1-form $\omega^1$ along the path $\gamma$.
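The construction of the Riemann sum can be followed step by step in code. The sketch below is a minimal illustration in Python, assuming a uniform partition of $[0, 1]$ into `n` parts and a path whose velocity is supplied explicitly; the form $x\,dy - y\,dx$, the unit circle, and all function names are our own choices for the example.

```python
import math


def integrate_1form(omega, gamma, velocity, n=1000):
    """Riemann sum of the 1-form omega along the path gamma: [0, 1] -> M.

    omega(point, vector) is the value of the form at `point` on `vector`;
    velocity(t) is the derivative of gamma at t.
    """
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        t = i * step
        # Image of the tangent vector Delta_i under the differential of gamma.
        xi = tuple(step * v for v in velocity(t))
        total += omega(gamma(t), xi)
    return total


def gamma(t):
    """Unit circle traversed once counterclockwise."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


def velocity(t):
    return (-2 * math.pi * math.sin(2 * math.pi * t),
            2 * math.pi * math.cos(2 * math.pi * t))


def omega(point, vector):
    """The 1-form x dy - y dx on the plane."""
    x, y = point
    return x * vector[1] - y * vector[0]


result = integrate_1form(omega, gamma, velocity)
```

For this particular form and path every term of the sum equals $2\pi/n$, so the sum is $2\pi$ up to rounding for any `n`; for a general form the sum only approaches the integral as the partition is refined.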

The definition of the integral of a $k$-form along a $k$-dimensional surface follows an analogous pattern. The surface of integration is partitioned into small curvilinear $k$-dimensional parallelepipeds (Figure 147); these parallelepipeds are replaced by parallelepipeds in the tangent space. The sum of the values of the form on the parallelepipeds in the tangent space approaches the integral as the partition is refined. We will first consider a particular case.

Figure 146 Integrating a 1-form along a path

Figure 147 Integrating a 2-form over a surface

B The integral of a k-form on oriented euclidean space $\mathbb{R}^k$

Let $x_1, \ldots, x_k$ be an oriented coordinate system on $\mathbb{R}^k$. Then every $k$-form on $\mathbb{R}^k$ is proportional to the form $dx_1 \wedge \cdots \wedge dx_k$, i.e., it has the form $\omega^k = \varphi(x)\,dx_1 \wedge \cdots \wedge dx_k$, where $\varphi(x)$ is a smooth function.

Let $D$ be a bounded convex polyhedron in $\mathbb{R}^k$ (Figure 148). By definition, the integral of the form $\omega^k$ on $D$ is the integral of the function $\varphi$:

$$\int_D \omega^k = \int_D \varphi(x)\,dx_1 \cdots dx_k,$$

where the integral on the right is understood to be the usual limit of Riemann sums.

Such a definition follows the pattern outlined above, since in this case the tangent space to the manifold is identified with the manifold.


Problem 1. Show that $\int_D \omega^k$ depends linearly on $\omega^k$.

Problem 2. Show that if we divide $D$ into two distinct polyhedra $D_1$ and $D_2$, then

$$\int_D \omega^k = \int_{D_1} \omega^k + \int_{D_2} \omega^k.$$

In the general case (a k-form on an n-dimensional space) it is not so easy to identify the elements of the partition with tangent parallelepipeds; we will consider this case below.


Figure 148 Integrating a k-form in k-dimensional space

C The behavior of differential forms under maps

Let $f: M \to N$ be a differentiable map of a smooth manifold $M$ to a smooth manifold $N$, and let $\omega$ be a differential k-form on $N$ (Figure 149). Then a well-defined k-form arises also on $M$: it is denoted by $f^*\omega$ and is defined by the relation

$$(f^*\omega)(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_k) = \omega(f_*\boldsymbol{\xi}_1, \ldots, f_*\boldsymbol{\xi}_k)$$

for any tangent vectors $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_k \in TM_x$. Here $f_*$ is the differential of the map $f$. In other words, the value of the form $f^*\omega$ on the vectors $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_k$ is equal to the value of $\omega$ on the images of these vectors.


Figure 149 A form on N induces a form on M.

Example. If $y = f(x_1, x_2) = x_1^2 + x_2^2$ and $\omega = dy$, then

$$f^*\omega = 2x_1\,dx_1 + 2x_2\,dx_2.$$
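As a numerical cross-check of this example, the following sketch evaluates $(f^*\omega)(\boldsymbol{\xi}) = \omega(f_*\boldsymbol{\xi})$ for $f = x_1^2 + x_2^2$ and $\omega = dy$, approximating the differential $f_*$ by a central difference. The function names and the step size `h` are illustrative assumptions, not notation from the text.

```python
def f(x1, x2):
    """The map f: R^2 -> R of the example, y = x1^2 + x2^2."""
    return x1 * x1 + x2 * x2


def pushforward(x, xi, h=1e-6):
    """Approximate f_* xi at the point x by a central difference along xi."""
    x1, x2 = x
    v1, v2 = xi
    return (f(x1 + h * v1, x2 + h * v2) - f(x1 - h * v1, x2 - h * v2)) / (2 * h)


def pullback_dy(x, xi):
    """(f* dy)(xi) = dy(f_* xi); on the line, dy returns the vector itself."""
    return pushforward(x, xi)


def formula(x, xi):
    """The form 2 x1 dx1 + 2 x2 dx2 evaluated on the vector xi at x."""
    return 2 * x[0] * xi[0] + 2 * x[1] * xi[1]
```

For any point `x` and vector `xi`, `pullback_dy(x, xi)` and `formula(x, xi)` should agree up to rounding error.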


Problem 3. Show that $f^*\omega$ is a k-form on $M$.

Problem 4. Show that the map $f^*$ preserves operations on forms:

$$f^*(\lambda_1\omega_1 + \lambda_2\omega_2) = \lambda_1 f^*(\omega_1) + \lambda_2 f^*(\omega_2),$$
$$f^*(\omega_1 \wedge \omega_2) = (f^*\omega_1) \wedge (f^*\omega_2).$$

Problem 5. Let $g: L \to M$ be a differentiable map. Show that $(fg)^* = g^*f^*$.


Problem 6. Let $D_1$ and $D_2$ be two compact, convex polyhedra in the oriented k-dimensional space $\mathbb{R}^k$ and $f: D_1 \to D_2$ a differentiable map which is an orientation-preserving diffeomorphism⁵⁵ of the interior of $D_1$ onto the interior of $D_2$. Then, for any differential k-form $\omega^k$ on $D_2$,

$$\int_{D_1} f^*\omega^k = \int_{D_2} \omega^k.$$

Hint. This is the change of variables theorem for a multiple integral:

$$\int_{D_1} \frac{\partial(y_1, \ldots, y_n)}{\partial(x_1, \ldots, x_n)}\,\varphi(y(x))\,dx_1 \cdots dx_n = \int_{D_2} \varphi(y)\,dy_1 \cdots dy_n.$$

55 i.e., one-to-one with a differentiable inverse.

D Integration of a k-form on an n-dimensional manifold

Let $\omega$ be a differential k-form on an n-dimensional manifold $M$. Let $D$ be a bounded convex k-dimensional polyhedron in k-dimensional euclidean space $\mathbb{R}^k$ (Figure 150). The role of "path of integration" will be played by a k-dimensional cell⁵⁶ $\sigma$ of $M$ represented by a triple $\sigma = (D, f, \mathrm{Or})$ consisting of

1. a convex polyhedron $D \subset \mathbb{R}^k$,

2. a differentiable map $f: D \to M$, and

3. an orientation on $\mathbb{R}^k$, denoted by Or.

Figure 150 Singular k-dimensional polyhedron


Definition. The integral of the k-form $\omega$ over the k-dimensional cell $\sigma$ is the integral of the corresponding form over the polyhedron $D$:

$$\int_\sigma \omega = \int_D f^*\omega.$$

Problem 7. Show that the integral depends linearly on the form:

$$\int_\sigma (\lambda_1\omega_1 + \lambda_2\omega_2) = \lambda_1 \int_\sigma \omega_1 + \lambda_2 \int_\sigma \omega_2.$$

The k-dimensional cell which differs from $\sigma$ only by the choice of orientation is called the negative of $\sigma$ and is denoted by $-\sigma$ or $-1 \cdot \sigma$ (Figure 151).


Figure  151 


Problem 8. Show that, under a change of orientation, the integral changes sign:

$$\int_{-\sigma} \omega = -\int_\sigma \omega.$$

56 The cell $\sigma$ is usually called a singular k-dimensional polyhedron.

E Chains

The set $f(D)$ is not necessarily a smooth submanifold of $M$. It could have "self-intersections" or "folds" and could even be reduced to a point. However, even in the one-dimensional case, it is clear that it is inconvenient to restrict ourselves to contours of integration consisting of one piece: it is useful to be able to consider contours consisting of several pieces which can be traversed in either direction, perhaps more than once. The analogous concept in higher dimensions is called a chain.

Definition. A chain of dimension k on a manifold $M$ consists of a finite collection of k-dimensional oriented cells $\sigma_1, \ldots, \sigma_r$ in $M$ and integers $m_1, \ldots, m_r$, called multiplicities (the multiplicities can be positive, negative, or zero). A chain is denoted by

$$c_k = m_1\sigma_1 + \cdots + m_r\sigma_r.$$

We introduce the natural identifications

$$m_1\sigma + m_2\sigma = (m_1 + m_2)\sigma,$$
$$m_1\sigma_1 + m_2\sigma_2 = m_2\sigma_2 + m_1\sigma_1, \qquad 0\sigma = 0, \qquad c_k + 0 = c_k.$$

Problem 9. Show that the set of all k-chains on $M$ forms a commutative group if we define the addition of chains by the formula

$$(m_1\sigma_1 + \cdots + m_r\sigma_r) + (m'_1\sigma'_1 + \cdots + m'_{r'}\sigma'_{r'}) = m_1\sigma_1 + \cdots + m_r\sigma_r + m'_1\sigma'_1 + \cdots + m'_{r'}\sigma'_{r'}.$$
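The bookkeeping behind these identifications can be sketched in a few lines. The representation below is an assumption made for illustration: a chain is a dictionary from a cell label (any hashable name, here a string) to its integer multiplicity, and the negative cell $-\sigma$ is stored as $\sigma$ with multiplicity $-1$.

```python
def normalize(chain):
    """Drop cells with multiplicity 0 (the identification 0*sigma = 0)."""
    return {cell: m for cell, m in chain.items() if m != 0}


def add_chains(c1, c2):
    """Add two chains; equal cells have their multiplicities summed."""
    total = dict(c1)
    for cell, m in c2.items():
        total[cell] = total.get(cell, 0) + m
    return normalize(total)


def negate(chain):
    """The inverse of a chain with respect to addition."""
    return {cell: -m for cell, m in chain.items()}
```

With this representation the empty dictionary plays the role of the zero chain, and commutativity follows from the commutativity of integer addition.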

F  Example:  the  boundary  of  a  polyhedron 

Let $D$ be a convex oriented k-dimensional polyhedron in k-dimensional euclidean space $\mathbb{R}^k$. The boundary of $D$ is the (k − 1)-chain $\partial D$ on $\mathbb{R}^k$ defined in the following way (Figure 152).

The cells of the chain $\partial D$ are the (k − 1)-dimensional faces $D_i$ of the polyhedron $D$, together with maps $f_i: D_i \to \mathbb{R}^k$ embedding the faces in $\mathbb{R}^k$ and orientations $\mathrm{Or}_i$ defined below; the multiplicities are equal to 1:

$$\partial D = \sum \sigma_i, \qquad \sigma_i = (D_i, f_i, \mathrm{Or}_i).$$

Rule of orientation of the boundary. Let $\mathbf{e}_1, \ldots, \mathbf{e}_k$ be an oriented frame in $\mathbb{R}^k$. Let $D_i$ be one of the faces of $D$. We choose an interior point of $D_i$ and there construct a vector $\mathbf{n}$ outwardly normal to the polyhedron $D$. An orienting frame for the face $D_i$ will be a frame $\mathbf{f}_1, \ldots, \mathbf{f}_{k-1}$ on $D_i$ such that the frame $(\mathbf{n}, \mathbf{f}_1, \ldots, \mathbf{f}_{k-1})$ is oriented correctly (i.e., the same way as the frame $\mathbf{e}_1, \ldots, \mathbf{e}_k$).

Figure 152 Oriented boundary

The boundary of a chain is defined in an analogous way. Let $\sigma = (D, f, \mathrm{Or})$ be a k-dimensional cell in the manifold $M$. Its boundary $\partial\sigma$ is the (k − 1)-chain $\partial\sigma = \sum \sigma_i$ consisting of the cells $\sigma_i = (D_i, f_i, \mathrm{Or}_i)$, where the $D_i$ are the (k − 1)-dimensional faces of $D$, the $\mathrm{Or}_i$ are orientations chosen by the rule above, and the $f_i$ are the restrictions of the mapping $f: D \to M$ to the faces $D_i$.

The boundary $\partial c_k$ of the k-dimensional chain $c_k$ in $M$ is the sum of the boundaries of the cells of $c_k$ with multiplicities (Figure 153):

$$\partial c_k = \partial(m_1\sigma_1 + \cdots + m_r\sigma_r) = m_1\,\partial\sigma_1 + \cdots + m_r\,\partial\sigma_r.$$

Obviously, $\partial c_k$ is a (k − 1)-chain on $M$.⁵⁷


Figure  153  Boundary  of  a  chain 

Problem 10. Show that the boundary of the boundary of any chain is zero: $\partial\partial c_k = 0$.

Hint. By the linearity of $\partial$ it is enough to show that $\partial\partial D = 0$ for a convex polyhedron $D$. It remains to verify that every (k − 2)-dimensional face of $D$ appears in $\partial\partial D$ twice, with opposite signs. It is enough to prove this for k = 2 (planar cross-sections).
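The cancellation described in the hint can be watched in a combinatorial model. The sketch below is not the polyhedral boundary of the text: it assumes the standard simplicial convention, in which a simplex is an ordered tuple of vertex labels and the face omitting vertex $i$ enters the boundary with sign $(-1)^i$. Every (k − 2)-face then appears twice with opposite signs, and the double boundary of any chain comes out empty.

```python
def boundary_of_simplex(simplex):
    """Boundary of an ordered simplex (v0, ..., vk) as {face: sign}."""
    if len(simplex) <= 1:
        return {}  # the boundary of a point is empty
    faces = {}
    for i in range(len(simplex)):
        face = simplex[:i] + simplex[i + 1:]
        faces[face] = faces.get(face, 0) + (-1) ** i
    return faces


def boundary(chain):
    """Extend the boundary to chains by linearity, dropping zero terms."""
    result = {}
    for simplex, m in chain.items():
        for face, sign in boundary_of_simplex(simplex).items():
            result[face] = result.get(face, 0) + m * sign
    return {face: m for face, m in result.items() if m != 0}
```

For an oriented interval `(A, B)` this gives `B` with multiplicity 1 and `A` with multiplicity −1, in agreement with the convention for one-dimensional chains.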

G  The  integral  of  a  form  over  a  chain 

Let $\omega^k$ be a k-form on $M$, and $c_k$ a k-chain on $M$, $c_k = \sum m_i\sigma_i$. The integral of the form $\omega^k$ over the chain $c_k$ is the sum of the integrals on the cells, counting multiplicities:

$$\int_{c_k} \omega^k = \sum m_i \int_{\sigma_i} \omega^k.$$

Problem 11. Show that the integral depends linearly on the form:

$$\int_{c_k} (\omega_1^k + \omega_2^k) = \int_{c_k} \omega_1^k + \int_{c_k} \omega_2^k.$$

Problem 12. Show that integration of a fixed form $\omega^k$ on chains $c_k$ defines a homomorphism from the group of chains to the line.

57 We are taking k > 1 here. One-dimensional chains are included in the general scheme if we make the following definitions: a zero-dimensional chain consists of a collection of points with multiplicities; the boundary of an oriented interval $AB$ is $B - A$ (the point $B$ with multiplicity 1 and $A$ with multiplicity −1); the boundary of a point is empty.

Example 1. Let $M$ be the plane $\{(p, q)\}$, $\omega^1$ the form $p\,dq$, and $c_1$ the chain consisting of one cell $\sigma$ with multiplicity 1:

$$[0 \le t \le 2\pi] \xrightarrow{\;f\;} (p = \cos t,\ q = \sin t).$$

Then $\int_{c_1} p\,dq = \pi$. In general, if a chain $c_1$ represents the boundary of a region $G$ (Figure 154), then $\int_{c_1} p\,dq$ is equal to the area of $G$ with sign + or − depending on whether the pair of vectors (outward normal, oriented boundary vector) has the same or opposite orientation as the pair ($p$ axis, $q$ axis).
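The value $\pi$ in Example 1 can be reproduced with a Riemann sum of the kind used to define the integral of a 1-form along a path. In this sketch the path is reparametrized to $0 \le t \le 1$ and each small interval is replaced by its chord; the function names and the number of subdivisions `n` are illustrative assumptions.

```python
import math


def integrate_p_dq(path, n=10000):
    """Riemann-sum approximation of the integral of p dq along a path.

    `path` maps t in [0, 1] to a point (p, q). On each small interval the
    form is evaluated at the midpoint and applied to the chord vector.
    """
    total = 0.0
    for i in range(n):
        t0, t1 = i / n, (i + 1) / n
        p_mid, _ = path(0.5 * (t0 + t1))
        dq = path(t1)[1] - path(t0)[1]
        total += p_mid * dq
    return total


def unit_circle(t):
    """The cell of Example 1, with the parameter rescaled to [0, 1]."""
    angle = 2 * math.pi * t
    return math.cos(angle), math.sin(angle)


def reversed_circle(t):
    """The same circle traversed in the opposite direction."""
    return unit_circle(1.0 - t)
```

Integrating over `reversed_circle` changes the sign of the result, which illustrates the role of orientation.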


Figure 154 The integral of the form p dq over the boundary of a region is equal to the area of the region.


Example 2. Let $M$ be the oriented three-dimensional euclidean space $\mathbb{R}^3$. Then every 1-form on $M$ corresponds to some vector field $\mathbf{A}$ ($\omega^1 = \omega^1_{\mathbf{A}}$), where

$$\omega^1_{\mathbf{A}}(\boldsymbol{\xi}) = (\mathbf{A}, \boldsymbol{\xi}).$$

The integral of $\omega^1_{\mathbf{A}}$ on a chain $c_1$ representing a curve $l$ is called the circulation of the field $\mathbf{A}$ over the curve $l$:

$$\int_{c_1} \omega^1_{\mathbf{A}} = \int_l (\mathbf{A}, d\mathbf{l}).$$

Every 2-form on $M$ also corresponds to some field $\mathbf{A}$ ($\omega^2 = \omega^2_{\mathbf{A}}$, where $\omega^2_{\mathbf{A}}(\boldsymbol{\xi}, \boldsymbol{\eta}) = (\mathbf{A}, \boldsymbol{\xi}, \boldsymbol{\eta})$). The integral of the form $\omega^2_{\mathbf{A}}$ on a chain $c_2$ representing an oriented surface $S$ is called the flux of the field $\mathbf{A}$ through the surface $S$:

$$\int_{c_2} \omega^2_{\mathbf{A}} = \int_S (\mathbf{A}, d\mathbf{n}).$$

Problem 13. Find the flux of the field $\mathbf{A} = (1/R^2)\mathbf{e}_R$ over the surface of the sphere $x^2 + y^2 + z^2 = 1$, oriented by the vectors $\mathbf{e}_x, \mathbf{e}_y$ at the point $z = 1$. Find the flux of the same field over the surface of the ellipsoid $(x^2/a^2) + (y^2/b^2) + z^2 = 1$ oriented the same way.

Hint. Cf. Section 36H.

Problem 14. Suppose that, in the 2n-dimensional space $\mathbb{R}^{2n} = \{(p_1, \ldots, p_n; q_1, \ldots, q_n)\}$, we are given a 2-chain $c_2$ representing a two-dimensional oriented surface $S$ with boundary $l$. Find

$$\int_{c_2} dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n \quad\text{and}\quad \int_l p_1\,dq_1 + \cdots + p_n\,dq_n.$$

Answer. The sum of the oriented areas of the projections of $S$ on the two-dimensional coordinate planes $p_i, q_i$.

36 Exterior differentiation

We  define  here  exterior  differentiation  of  k -forms  and  prove  Stokes’  theorem:  the  integral  of  the 
derivative  of  a  form  over  a  chain  is  equal  to  the  integral  of  the  form  itself  over  the  boundary  of 
the  chain. 

A  Example:  the  divergence  of  a  vector  field 

The exterior derivative of a k-form $\omega$ on a manifold $M$ is a (k + 1)-form $d\omega$ on the same manifold. Going from a form to its exterior derivative is analogous to forming the differential of a function or the divergence of a vector field. We recall the definition of divergence.


Figure  155  Definition  of  divergence  of  a  vector  field 


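Let $\mathbf{A}$ be a vector field on the oriented euclidean three-space $\mathbb{R}^3$, and let $S$ be the boundary of a parallelepiped $\Pi$ with edges $\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\xi}_3$ at the vertex $\mathbf{x}$ (Figure 155). Consider the ("outward") flux of the field $\mathbf{A}$ through the surface $S$:

$$F(\Pi) = \int_S (\mathbf{A}, d\mathbf{n}).$$

If the parallelepiped $\Pi$ is very small, the flux $F$ is approximately proportional to the product of the volume of the parallelepiped, $V = (\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\xi}_3)$, and the "source density" at the point $\mathbf{x}$. This is the limit

$$\lim_{\varepsilon \to 0} \frac{F(\varepsilon\Pi)}{\varepsilon^3 V},$$

where $\varepsilon\Pi$ is the parallelepiped with edges $\varepsilon\boldsymbol{\xi}_1, \varepsilon\boldsymbol{\xi}_2, \varepsilon\boldsymbol{\xi}_3$. This limit does not depend on the choice of the parallelepiped $\Pi$ but only on the point $\mathbf{x}$, and is called the divergence, $\operatorname{div}\mathbf{A}$, of the field $\mathbf{A}$ at $\mathbf{x}$.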
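The limit defining the divergence can be approximated directly. The sketch below assumes a sample field $\mathbf{A} = (x^2, y, xz)$, for which $\operatorname{div}\mathbf{A} = 3x + 1$, and takes $\Pi$ to be the unit cube with edges along the coordinate axes, so that $V = 1$. The flux through each face is approximated by the value of the field at the center of the face times the area of the face; these choices are illustrative only.

```python
def field(x, y, z):
    """A sample field A = (x^2, y, x*z); its divergence is 3*x + 1."""
    return (x * x, y, x * z)


def flux_through_cube(A, corner, eps):
    """Outward flux of A through the cube with one vertex at `corner` and
    edges eps*e_x, eps*e_y, eps*e_z, using one midpoint sample per face."""
    x0, y0, z0 = corner
    h = 0.5 * eps
    area = eps * eps
    flux = 0.0
    # Opposite faces come in pairs: the outward normal is +e_i on the far
    # face and -e_i on the near face.
    flux += (A(x0 + eps, y0 + h, z0 + h)[0] - A(x0, y0 + h, z0 + h)[0]) * area
    flux += (A(x0 + h, y0 + eps, z0 + h)[1] - A(x0 + h, y0, z0 + h)[1]) * area
    flux += (A(x0 + h, y0 + h, z0 + eps)[2] - A(x0 + h, y0 + h, z0)[2]) * area
    return flux


def divergence_estimate(A, corner, eps=1e-3):
    """F(eps*Pi) / (eps^3 * V) for the unit cube Pi, where V = 1."""
    return flux_through_cube(A, corner, eps) / eps ** 3
```

At the vertex (1, 2, 3) the estimate should approach $3 \cdot 1 + 1 = 4$ as `eps` decreases.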

To go to higher-dimensional cases, we note that the "flux of $\mathbf{A}$ through a surface element" is the 2-form which we called $\omega^2_{\mathbf{A}}$. The divergence, then, is the density in the expression for the 3-form

$$\omega^3 = \operatorname{div}\mathbf{A}\; dx \wedge dy \wedge dz,$$

$$\omega^3(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\xi}_3) = \operatorname{div}\mathbf{A} \cdot V(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\xi}_3),$$

characterizing the "sources in an elementary parallelepiped."

The exterior derivative $d\omega^k$ of a k-form $\omega^k$ on an n-dimensional manifold $M$ may be defined as the principal multilinear part of the integral of $\omega^k$ over the boundaries of (k + 1)-dimensional parallelepipeds.

B  Definition  of  the  exterior  derivative 

We define the value of the form $d\omega$ on k + 1 vectors $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1}$ tangent to $M$ at $x$. To do this, we choose some coordinate system in a neighborhood of $x$ on $M$, i.e., a differentiable map $f$ of a neighborhood of the point 0 in euclidean space $\mathbb{R}^n$ to a neighborhood of $x$ in $M$ (Figure 156).


Figure 156 The curvilinear parallelepiped $\Pi$.


The pre-images of the vectors $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1} \in TM_x$ under the differential of $f$ lie in the tangent space to $\mathbb{R}^n$ at 0. This tangent space can be naturally identified with $\mathbb{R}^n$, so we may consider the pre-images to be vectors

$$\boldsymbol{\xi}_1^*, \ldots, \boldsymbol{\xi}_{k+1}^* \in \mathbb{R}^n.$$

We take the parallelepiped $\Pi^*$ in $\mathbb{R}^n$ spanned by these vectors (strictly speaking, we must look at the standard oriented cube in $\mathbb{R}^{k+1}$ and its linear map onto $\Pi^*$, taking the edges $\mathbf{e}_1, \ldots, \mathbf{e}_{k+1}$ to $\boldsymbol{\xi}_1^*, \ldots, \boldsymbol{\xi}_{k+1}^*$, as a (k + 1)-dimensional cell in $\mathbb{R}^n$). The map $f$ takes the parallelepiped $\Pi^*$ to a (k + 1)-dimensional cell $\Pi$ on $M$ (a "curvilinear parallelepiped"). The boundary of the cell $\Pi$ is a k-chain, $\partial\Pi$. Consider the integral of the form $\omega^k$ on the boundary $\partial\Pi$ of $\Pi$:

$$F(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1}) = \int_{\partial\Pi} \omega^k.$$

Example. We will call a smooth function $\varphi: M \to \mathbb{R}$ a 0-form on $M$. The integral of the 0-form $\varphi$ on the 0-chain $c_0 = \sum m_i A_i$ (where the $m_i$ are integers and the $A_i$ points of $M$) is

$$\int_{c_0} \varphi = \sum m_i\,\varphi(A_i).$$

Then the definition above gives the "increment" $F(\boldsymbol{\xi}_1) = \varphi(x_1) - \varphi(x)$ (Figure 157) of the function $\varphi$, and the principal linear part of $F(\boldsymbol{\xi}_1)$ at 0 is simply the differential of $\varphi$.

Figure 157 The integral over the boundary of a one-dimensional parallelepiped is the change in the function.

Problem 1. Show that the function $F(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1})$ is skew-symmetric with respect to $\boldsymbol{\xi}$.

It turns out that the principal (k + 1)-linear part of the "increment" $F(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1})$ is an exterior (k + 1)-form on the tangent space $TM_x$ to $M$ at $x$. This form does not depend on the coordinate system that was used to define the curvilinear parallelepiped $\Pi$. It is called the exterior derivative, or differential, of the form $\omega^k$ (at the point $x$) and is denoted by $d\omega^k$.

C  A  theorem  on  exterior  derivatives 

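Theorem. There is a unique (k + 1)-form $\Omega$ on $TM_x$ which is the principal (k + 1)-linear part at 0 of the integral over the boundary of a curvilinear parallelepiped, $F(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1})$; i.e.,

$$(1)\qquad F(\varepsilon\boldsymbol{\xi}_1, \ldots, \varepsilon\boldsymbol{\xi}_{k+1}) = \varepsilon^{k+1}\,\Omega(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k+1}) + o(\varepsilon^{k+1}) \qquad (\varepsilon \to 0).$$

The form $\Omega$ does not depend on the choice of coordinates involved in the definition of $F$. If, in the local coordinate system $x_1, \ldots, x_n$ on $M$, the form $\omega^k$ is written as

$$\omega^k = \sum a_{i_1, \ldots, i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k},$$

then $\Omega$ is written as

$$(2)\qquad \Omega = d\omega^k = \sum da_{i_1, \ldots, i_k} \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$

We will carry out the proof of this theorem for the case of a form $\omega^1 = a(x_1, x_2)\,dx_1$ on the $x_1, x_2$ plane. The proof in the general case is entirely analogous, but the calculations are somewhat longer.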

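We calculate $F(\boldsymbol{\xi}, \boldsymbol{\eta})$, i.e., the integral of $\omega^1$ on the boundary of the parallelogram $\Pi$ with sides $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ and vertex at 0 (Figure 158). The chain $\partial\Pi$ is given by the mappings of the interval $0 \le t \le 1$ to the plane $t \to \boldsymbol{\xi}t$, $t \to \boldsymbol{\xi} + \boldsymbol{\eta}t$, $t \to \boldsymbol{\eta}t$, and $t \to \boldsymbol{\eta} + \boldsymbol{\xi}t$ with multiplicities 1, 1, −1, and −1. Therefore,

$$\int_{\partial\Pi} \omega^1 = \int_0^1 \Big\{ [a(\boldsymbol{\xi}t) - a(\boldsymbol{\xi}t + \boldsymbol{\eta})]\,\xi_1 - [a(\boldsymbol{\eta}t) - a(\boldsymbol{\eta}t + \boldsymbol{\xi})]\,\eta_1 \Big\}\,dt,$$

where $\xi_1 = dx_1(\boldsymbol{\xi})$, $\eta_1 = dx_1(\boldsymbol{\eta})$, $\xi_2 = dx_2(\boldsymbol{\xi})$, and $\eta_2 = dx_2(\boldsymbol{\eta})$ are the components of the vectors $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$.

Figure 158 Theorem on exterior derivatives

But

$$a(\boldsymbol{\xi}t + \boldsymbol{\eta}) - a(\boldsymbol{\xi}t) = \frac{\partial a}{\partial x_1}\eta_1 + \frac{\partial a}{\partial x_2}\eta_2 + O(\boldsymbol{\xi}^2, \boldsymbol{\eta}^2)$$

(the derivatives are taken at $x_1 = x_2 = 0$). In the same way

$$a(\boldsymbol{\eta}t + \boldsymbol{\xi}) - a(\boldsymbol{\eta}t) = \frac{\partial a}{\partial x_1}\xi_1 + \frac{\partial a}{\partial x_2}\xi_2 + O(\boldsymbol{\xi}^2, \boldsymbol{\eta}^2).$$

By using these expressions in the integral, we find that

$$F(\boldsymbol{\xi}, \boldsymbol{\eta}) = \int_{\partial\Pi} \omega^1 = \frac{\partial a}{\partial x_2}(\xi_2\eta_1 - \xi_1\eta_2) + o(\boldsymbol{\xi}^2, \boldsymbol{\eta}^2).$$

The principal bilinear part of $F$, as promised in (1), turns out to be the value of the exterior 2-form

$$\Omega = \frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1$$

on the pair of vectors $\boldsymbol{\xi}, \boldsymbol{\eta}$. Thus the form obtained is given by formula (2), since

$$da \wedge dx_1 = \frac{\partial a}{\partial x_1}\,dx_1 \wedge dx_1 + \frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1 = \frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1.$$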
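The computation for $\omega^1 = a(x_1, x_2)\,dx_1$ can be checked numerically: integrate $a\,dx_1$ over the four edges of a small parallelogram with the multiplicities 1, 1, −1, −1 and compare with $(\partial a/\partial x_2)(\xi_2\eta_1 - \xi_1\eta_2)$. The sketch assumes a sample coefficient $a = \sin x_1 + 3x_2 + x_1x_2$, for which $\partial a/\partial x_2 = 3$ at the origin; the function names are hypothetical.

```python
import math


def a(x1, x2):
    """Sample coefficient with da/dx2 = 3 at the origin (an assumption)."""
    return math.sin(x1) + 3.0 * x2 + x1 * x2


def edge_integral(start, direction, n=200):
    """Integral of a(x) dx1 along t -> start + t*direction, 0 <= t <= 1."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += a(start[0] + t * direction[0], start[1] + t * direction[1])
    return total / n * direction[0]


def F(xi, eta):
    """Integral of a dx1 over the boundary of the parallelogram (xi, eta):
    edges t -> xi*t and t -> xi + eta*t count with multiplicity +1,
    edges t -> eta*t and t -> eta + xi*t with multiplicity -1."""
    origin = (0.0, 0.0)
    return (edge_integral(origin, xi) + edge_integral(xi, eta)
            - edge_integral(origin, eta) - edge_integral(eta, xi))


def principal_part(xi, eta, da_dx2=3.0):
    """Value of (da/dx2) dx2 ^ dx1 on the pair of vectors (xi, eta)."""
    return da_dx2 * (xi[1] * eta[0] - xi[0] * eta[1])


def scaled(v, eps):
    """The vector eps * v."""
    return (eps * v[0], eps * v[1])
```

Dividing `F(scaled(xi, eps), scaled(eta, eps))` by `eps ** 2` should approach `principal_part(xi, eta)` as `eps` tends to zero, which is relation (1) for k = 1.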

Finally, if the coordinate system $x_1, x_2$ is changed to another (Figure 159), the parallelogram $\Pi$ is changed to a nearby curvilinear parallelogram $\Pi'$, so that the difference in the values of the integrals, $\int_{\partial\Pi} \omega^1 - \int_{\partial\Pi'} \omega^1$, will be small of more than second order (prove it!). □

Problem 2. Carry out the proof of the theorem in the general case.

Problem 3. Prove the formulas for differentiating a sum and a product:

$$d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2$$

and

$$d(\omega^k \wedge \omega^l) = d\omega^k \wedge \omega^l + (-1)^k\,\omega^k \wedge d\omega^l.$$

Problem 4. Show that the differential of a differential is equal to zero: $dd = 0$.

Problem 5. Let $f: M \to N$ be a smooth map and $\omega$ a k-form on $N$. Show that $f^*(d\omega) = d(f^*\omega)$.


D  Stokes’  formula 

One of the most important corollaries of the theorem on exterior derivatives is the Newton-Leibniz-Gauss-Green-Ostrogradskii-Stokes-Poincaré formula:

$$(3)\qquad \int_{\partial c} \omega = \int_c d\omega,$$

where $c$ is any (k + 1)-chain on a manifold $M$ and $\omega$ is any k-form on $M$.

To prove this formula it is sufficient to prove it for the case when the chain consists of one cell $\sigma$. We assume first that this cell $\sigma$ is given by an oriented parallelepiped $\Pi \subset \mathbb{R}^{k+1}$ (Figure 160).


Figure 160 Proof of Stokes’ formula for a parallelepiped


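We partition $\Pi$ into $N^{k+1}$ small equal parallelepipeds $\Pi_i$ similar to $\Pi$. Then, clearly,

$$\int_{\partial\Pi} \omega = \sum_{i=1}^{N^{k+1}} F_i, \qquad\text{where } F_i = \int_{\partial\Pi_i} \omega.$$

By formula (1) we have

$$F_i = d\omega(\boldsymbol{\xi}_1^i, \ldots, \boldsymbol{\xi}_{k+1}^i) + o(N^{-(k+1)}),$$

where $\boldsymbol{\xi}_1^i, \ldots, \boldsymbol{\xi}_{k+1}^i$ are the edges of $\Pi_i$. But $\sum_{i=1}^{N^{k+1}} d\omega(\boldsymbol{\xi}_1^i, \ldots, \boldsymbol{\xi}_{k+1}^i)$ is a Riemann sum for $\int_\Pi d\omega$. It is easy to verify that $o(N^{-(k+1)})$ is uniform, so

$$\lim_{N \to \infty} \sum_{i=1}^{N^{k+1}} F_i = \lim_{N \to \infty} \sum_{i=1}^{N^{k+1}} d\omega(\boldsymbol{\xi}_1^i, \ldots, \boldsymbol{\xi}_{k+1}^i) = \int_\Pi d\omega.$$

Finally, we obtain

$$\int_{\partial\Pi} \omega = \sum F_i = \lim_{N \to \infty} \sum F_i = \int_\Pi d\omega.$$

Formula (3) follows automatically from this for any chain whose polyhedra are parallelepipeds.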

To prove formula (3) for any convex polyhedron $D$, it is enough to prove it for a simplex,⁵⁸ since $D$ can always be partitioned into simplices (Figure 161):

$$D = \sum D_i, \qquad \partial D = \sum \partial D_i.$$

Figure 161 Division of a convex polyhedron into simplices

Figure 162 Proof of Stokes’ formula for a simplex

We will prove formula (3) for a simplex. Notice that a k-dimensional oriented cube can be mapped onto a k-dimensional simplex so that:

1. The interior of the cube goes diffeomorphically, with its orientation preserved, onto the interior of the simplex;

2. The interiors of some (k − 1)-dimensional faces of the cube go diffeomorphically, with their orientations preserved, onto the interiors of the faces of the simplex; the images of the remaining (k − 1)-dimensional faces of the cube lie in the (k − 2)-dimensional faces of the simplex.

For example, for k = 2 such a map of the cube $0 \le x_1, x_2 \le 1$ onto the triangle is given by the formula $y_1 = x_1$, $y_2 = x_1x_2$ (Figure 162). Then, formula (3) for the simplex follows from formula (3) for the cube and the change of variables theorem (cf. Section 35C).

58 A two-dimensional simplex is a triangle, a three-dimensional simplex is a tetrahedron, a k-dimensional simplex is the convex hull of k + 1 points in $\mathbb{R}^n$ which do not lie in any (k − 1)-dimensional plane.

Example: $\{x \in \mathbb{R}^k : x_i \ge 0 \text{ and } \sum_{i=1}^k x_i \le 1\}$.
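The map $y_1 = x_1$, $y_2 = x_1x_2$ of the square onto the triangle can be tried out numerically. Its Jacobian determinant is $x_1$, which is positive in the interior of the square, so the map preserves orientation there, and the edge $x_1 = 0$ collapses to a vertex. The sketch below uses the change of variables theorem to integrate a sample density $\varphi(y) = y_1 + y_2$ (an assumption made for illustration) over the triangle $0 \le y_2 \le y_1 \le 1$ by integrating over the square; the exact value is 1/2.

```python
def to_triangle(x1, x2):
    """The map of the square onto the triangle: y1 = x1, y2 = x1 * x2."""
    return x1, x1 * x2


def jacobian(x1, x2):
    """det d(y1, y2)/d(x1, x2) = x1, positive inside the square."""
    return x1


def integral_over_square(phi, n=200):
    """Midpoint rule for phi(y(x)) * jacobian(x) over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x1, x2 = (i + 0.5) * h, (j + 0.5) * h
            total += phi(*to_triangle(x1, x2)) * jacobian(x1, x2)
    return total * h * h


def phi(y1, y2):
    """A sample density on the triangle 0 <= y2 <= y1 <= 1."""
    return y1 + y2
```

The constant density 1 gives the area of the triangle, which is a quick way to confirm that the Jacobian factor has been included.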


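Example 1. Consider the 1-form

$$\omega^1 = p_1\,dq_1 + \cdots + p_n\,dq_n = \mathbf{p}\,d\mathbf{q}$$

on $\mathbb{R}^{2n}$ with coordinates $p_1, \ldots, p_n, q_1, \ldots, q_n$. Then $d\omega^1 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n = d\mathbf{p} \wedge d\mathbf{q}$, so

$$\iint_{c_2} d\mathbf{p} \wedge d\mathbf{q} = \int_{\partial c_2} \mathbf{p}\,d\mathbf{q}.$$

In particular, if $c_2$ is a closed surface ($\partial c_2 = 0$), then $\iint_{c_2} d\mathbf{p} \wedge d\mathbf{q} = 0$.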


E  Example  2 —  Vector  analysis 

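In a three-dimensional oriented riemannian space $M$, every vector field $\mathbf{A}$ corresponds to a 1-form $\omega^1_{\mathbf{A}}$ and a 2-form $\omega^2_{\mathbf{A}}$. Therefore, exterior differentiation can be considered as an operation on vectors.

Exterior differentiation of 0-forms (functions), 1-forms, and 2-forms corresponds to the operations of gradient, curl, and divergence defined by the relations

$$df = \omega^1_{\operatorname{grad} f}, \qquad d\omega^1_{\mathbf{A}} = \omega^2_{\operatorname{curl}\mathbf{A}}, \qquad d\omega^2_{\mathbf{A}} = (\operatorname{div}\mathbf{A})\,\omega^3$$

(the form $\omega^3$ is the volume element on $M$). Thus, it follows from (3) that

$$f(y) - f(x) = \int_l \operatorname{grad} f\,d\mathbf{l} \quad\text{if } \partial l = y - x,$$
$$\int_l \mathbf{A}\,d\mathbf{l} = \iint_S \operatorname{curl}\mathbf{A} \cdot d\mathbf{n} \quad\text{if } \partial S = l,$$
$$\iint_S \mathbf{A}\,d\mathbf{n} = \iiint_D (\operatorname{div}\mathbf{A})\,\omega^3 \quad\text{if } \partial D = S.$$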


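Problem 6. Show that

$$\operatorname{div}[\mathbf{A}, \mathbf{B}] = (\operatorname{curl}\mathbf{A}, \mathbf{B}) - (\operatorname{curl}\mathbf{B}, \mathbf{A}),$$
$$\operatorname{curl} a\mathbf{A} = [\operatorname{grad} a, \mathbf{A}] + a \operatorname{curl}\mathbf{A},$$
$$\operatorname{div} a\mathbf{A} = (\operatorname{grad} a, \mathbf{A}) + a \operatorname{div}\mathbf{A}.$$

Hint. By the formula for differentiating the product of forms,

$$d(\omega^2_{[\mathbf{A}, \mathbf{B}]}) = d(\omega^1_{\mathbf{A}} \wedge \omega^1_{\mathbf{B}}) = d\omega^1_{\mathbf{A}} \wedge \omega^1_{\mathbf{B}} - \omega^1_{\mathbf{A}} \wedge d\omega^1_{\mathbf{B}}.$$

Problem 7. Show that curl grad = div curl = 0.

Hint. $dd = 0$.

F Appendix 1: Vector operations in triply orthogonal systems

Let $x_1, x_2, x_3$ be a triply orthogonal coordinate system on $M$, $ds^2 = E_1\,dx_1^2 + E_2\,dx_2^2 + E_3\,dx_3^2$ and $\mathbf{e}_i$ the coordinate unit vectors (cf. Section 34E).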


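Problem 8. Given the components of a vector field $\mathbf{A} = A_1\mathbf{e}_1 + A_2\mathbf{e}_2 + A_3\mathbf{e}_3$, find the components of its curl.

Solution. According to Section 34E,

$$\omega^1_{\mathbf{A}} = A_1\sqrt{E_1}\,dx_1 + A_2\sqrt{E_2}\,dx_2 + A_3\sqrt{E_3}\,dx_3.$$

Therefore,

$$d\omega^1_{\mathbf{A}} = \left(\frac{\partial (A_3\sqrt{E_3})}{\partial x_2} - \frac{\partial (A_2\sqrt{E_2})}{\partial x_3}\right) dx_2 \wedge dx_3 + \cdots.$$

According to Section 34E, we have

$$\operatorname{curl}\mathbf{A} = \frac{1}{\sqrt{E_2E_3}}\left(\frac{\partial (A_3\sqrt{E_3})}{\partial x_2} - \frac{\partial (A_2\sqrt{E_2})}{\partial x_3}\right)\mathbf{e}_1 + \cdots.$$

In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$:

$$\operatorname{curl}\mathbf{A} = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\mathbf{e}_x + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\mathbf{e}_y + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf{e}_z$$

$$= \frac{1}{r}\left(\frac{\partial A_z}{\partial\varphi} - \frac{\partial (rA_\varphi)}{\partial z}\right)\mathbf{e}_r + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right)\mathbf{e}_\varphi + \frac{1}{r}\left(\frac{\partial (rA_\varphi)}{\partial r} - \frac{\partial A_r}{\partial\varphi}\right)\mathbf{e}_z$$

$$= \frac{1}{R\cos\theta}\left(\frac{\partial A_\theta}{\partial\varphi} - \frac{\partial (A_\varphi\cos\theta)}{\partial\theta}\right)\mathbf{e}_R + \frac{1}{R}\left(\frac{\partial A_R}{\partial\theta} - \frac{\partial (RA_\theta)}{\partial R}\right)\mathbf{e}_\varphi + \frac{1}{R}\left(\frac{\partial (RA_\varphi)}{\partial R} - \frac{1}{\cos\theta}\frac{\partial A_R}{\partial\varphi}\right)\mathbf{e}_\theta.$$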

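Problem 9. Find the divergence of the field $\mathbf{A} = A_1\mathbf{e}_1 + A_2\mathbf{e}_2 + A_3\mathbf{e}_3$.

Solution. $\omega^2_{\mathbf{A}} = A_1\sqrt{E_2E_3}\,dx_2 \wedge dx_3 + \cdots$. Therefore,

$$d\omega^2_{\mathbf{A}} = \frac{\partial}{\partial x_1}\left(A_1\sqrt{E_2E_3}\right) dx_1 \wedge dx_2 \wedge dx_3 + \cdots.$$

By the definition of divergence,

$$d\omega^2_{\mathbf{A}} = \operatorname{div}\mathbf{A}\,\sqrt{E_1E_2E_3}\;dx_1 \wedge dx_2 \wedge dx_3.$$

This means

$$\operatorname{div}\mathbf{A} = \frac{1}{\sqrt{E_1E_2E_3}}\left(\frac{\partial}{\partial x_1}\left(A_1\sqrt{E_2E_3}\right) + \cdots\right).$$

In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$:

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \frac{1}{r}\left(\frac{\partial (rA_r)}{\partial r} + \frac{\partial A_\varphi}{\partial\varphi}\right) + \frac{\partial A_z}{\partial z}$$

$$= \frac{1}{R^2\cos\theta}\left(\frac{\partial (R^2\cos\theta\,A_R)}{\partial R} + \frac{\partial (RA_\varphi)}{\partial\varphi} + \frac{\partial (R\cos\theta\,A_\theta)}{\partial\theta}\right).$$

Problem 10. The Laplace operator on $M$ is the operator $\Delta = \operatorname{div}\operatorname{grad}$. Find its expression in the coordinates $x_i$.

Answer.

$$\Delta f = \frac{1}{\sqrt{E_1E_2E_3}}\left[\frac{\partial}{\partial x_1}\left(\sqrt{\frac{E_2E_3}{E_1}}\,\frac{\partial f}{\partial x_1}\right) + \cdots\right].$$

In particular, on $\mathbb{R}^3$:

$$\Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial z^2}$$

$$= \frac{1}{R^2\cos\theta}\left[\frac{\partial}{\partial R}\left(R^2\cos\theta\,\frac{\partial f}{\partial R}\right) + \frac{\partial}{\partial\varphi}\left(\frac{1}{\cos\theta}\frac{\partial f}{\partial\varphi}\right) + \frac{\partial}{\partial\theta}\left(\cos\theta\,\frac{\partial f}{\partial\theta}\right)\right].$$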

G  Appendix  2:  Closed  forms  and  cycles 

The  flux  of  an  incompressible  fluid  (without  sources)  across  the  boundary 
of  a  region  D  is  equal  to  zero.  We  will  formulate  a  higher-dimensional 
analogue  to  this  obvious  assertion.  The  higher-dimensional  analogue  of  an 
incompressible  fluid  is  called  a  closed  form.  The  field  A  has  no  sources  if 
div  A  =  0. 


Definition. A differential form $\omega$ on a manifold $M$ is closed if its exterior derivative is zero: $d\omega = 0$.


In particular, the 2-form $\omega^2_{\mathbf{A}}$ corresponding to a field $\mathbf{A}$ without sources is closed. Also, we have, by Stokes’ formula (3):


Theorem. The integral of a closed form $\omega^k$ over the boundary of any (k + 1)-dimensional chain $c_{k+1}$ is equal to zero:

$$\int_{\partial c_{k+1}} \omega^k = 0 \quad\text{if } d\omega^k = 0.$$


Problem  11.  Show  that  the  differential  of  a  form  is  always  closed. 


On the other hand, there are closed forms which are not differentials. For example, take for $M$ the three-dimensional euclidean space $\mathbb{R}^3$ without 0: $M = \mathbb{R}^3 - 0$, with the 2-form being the flux of the field $\mathbf{A} = (1/R^2)\mathbf{e}_R$ (Figure 163). It is easy to convince oneself that $\operatorname{div}\mathbf{A} = 0$, so that our 2-form $\omega^2_{\mathbf{A}}$ is closed. At the same time, the flux over any sphere with center 0 is equal to $4\pi$. We will show that the integral of the differential of a form over the sphere must be zero.

Definition.  A  cycle  on  a  manifold  M  is  a  chain  whose  boundary  is  equal  to 
zero. 

The  oriented  surface  of  our  sphere  can  be  considered  to  be  a  cycle.  It 
immediately  follows  from  Stokes’  formula  (3)  that 

Theorem. The integral of a differential over any cycle is equal to zero:

$$\int_{c_{k+1}} d\omega^k = 0 \quad\text{if } \partial c_{k+1} = 0.$$

Thus, our 2-form $\omega^2_{\mathbf{A}}$ is not the differential of any 1-form.
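The value $4\pi$ of the flux of $\mathbf{A} = (1/R^2)\mathbf{e}_R$ can be confirmed by direct quadrature. The sketch below parametrizes a closed surface around the origin by latitude $u$ and longitude $v$ and sums $(\mathbf{A}, \mathbf{r}_v \times \mathbf{r}_u)\,du\,dv$, where $\mathbf{r}_v \times \mathbf{r}_u$ is an outward normal. Allowing three different semi-axes is an extension beyond the spheres discussed in the text and is included only to illustrate that the flux of a closed form does not change when the surface is deformed; all names are hypothetical.

```python
import math


def cross(u, v):
    """Vector product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def flux_through_ellipsoid(a, b, c, n_lat=200, n_lon=200):
    """Flux of A = e_R / R^2 through x^2/a^2 + y^2/b^2 + z^2/c^2 = 1.

    Midpoint rule in latitude u and longitude v; the vector r_v x r_u
    points outward, so the result is the outward flux.
    """
    du = math.pi / n_lat
    dv = 2.0 * math.pi / n_lon
    total = 0.0
    for i in range(n_lat):
        u = -0.5 * math.pi + (i + 0.5) * du
        cu, su = math.cos(u), math.sin(u)
        for j in range(n_lon):
            v = (j + 0.5) * dv
            cv, sv = math.cos(v), math.sin(v)
            r = (a * cu * cv, b * cu * sv, c * su)
            r_u = (-a * su * cv, -b * su * sv, c * cu)
            r_v = (-a * cu * sv, b * cu * cv, 0.0)
            normal = cross(r_v, r_u)
            R3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
            # A = r / R^3, so (A, normal) = (r, normal) / R^3.
            total += (r[0] * normal[0] + r[1] * normal[1]
                      + r[2] * normal[2]) / R3
    return total * du * dv
```

Calling `flux_through_ellipsoid(R, R, R)` for several radii `R` should return values close to $4\pi \approx 12.566$, independent of `R`.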

The existence of closed forms on $M$ which are not differentials is related to the topological properties of $M$. One can show that every closed k-form on a vector space is the differential of some (k − 1)-form (Poincaré’s lemma).


Problem  12.  Prove  Poincare’s  lemma  for  1-forms. 

Hint. Consider $\int_{x_0}^{x_1} \omega^1 = \varphi(x_1)$.

Problem 13. Show that in a vector space the integral of a closed form over any cycle is zero.

Hint. Construct a (k + 1)-chain whose boundary is the given cycle (Figure 164).

Namely, for any chain $c$ consider the "cone over $c$ with vertex 0." If we denote the operation of constructing a cone by $p$, then

$$\partial \circ p + p \circ \partial = 1 \quad\text{(the identity map)}.$$

Therefore, if the chain $c$ is closed, $\partial(pc) = c$.


Problem. Show that every closed form on a vector space is an exterior derivative.

Hint. Use the cone construction. Let $\omega^k$ be a differential k-form on $\mathbb{R}^n$. We define a (k − 1)-form (the "co-cone over $\omega$") $p\omega^k$ in the following way: for any chain $c_{k-1}$,

$$\int_{c_{k-1}} p\omega^k = \int_{pc_{k-1}} \omega^k.$$

It is easy to see that the (k − 1)-form $p\omega^k$ exists and is unique; its value on the vectors $\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k-1}$ tangent to $\mathbb{R}^n$ at $\mathbf{x}$ is equal to

$$(p\omega^k)_{\mathbf{x}}(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k-1}) = \int_0^1 \omega^k_{t\mathbf{x}}(\mathbf{x}, t\boldsymbol{\xi}_1, \ldots, t\boldsymbol{\xi}_{k-1})\,dt.$$

It is easy to see that

$$d \circ p + p \circ d = 1 \quad\text{(the identity map)}.$$

Therefore, if the form $\omega^k$ is closed, $d(p\omega^k) = \omega^k$.
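For k = 1 the co-cone is a function, $\varphi(\mathbf{x}) = \int_0^1 \omega_{t\mathbf{x}}(\mathbf{x})\,dt$, and the statement $d(p\omega) = \omega$ says that $\varphi$ is a potential of the closed 1-form $\omega$. The sketch below tests this on the plane for a sample closed form $\omega = (2xy + 1)\,dx + (x^2 + 3y^2)\,dy$, chosen here as an assumption; the integral over $t$ is computed by Simpson's rule and the differential of $\varphi$ by central differences.

```python
def omega(x, y):
    """Coefficients (a, b) of the closed 1-form a dx + b dy; here
    da/dy = db/dx = 2x. This sample form is an assumption."""
    return 2.0 * x * y + 1.0, x * x + 3.0 * y * y


def potential(x, y, n=100):
    """phi(x) = integral over 0 <= t <= 1 of omega_{tx}(x) dt (Simpson)."""
    def integrand(t):
        a, b = omega(t * x, t * y)
        return a * x + b * y

    h = 1.0 / n  # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def d_potential(x, y, h=1e-5):
    """Central-difference approximation of (d phi/dx, d phi/dy)."""
    return ((potential(x + h, y) - potential(x - h, y)) / (2 * h),
            (potential(x, y + h) - potential(x, y - h)) / (2 * h))
```

For this sample form the potential is $\varphi = x^2y + x + y^3$, so `d_potential(x, y)` should reproduce `omega(x, y)` up to discretization error.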


Problem. Let $X$ be a vector field on $M$ and $\omega$ a differential k-form. We define a differential (k − 1)-form $i_X\omega$ (the interior derivative of $\omega$ by $X$) by the relation

$$(i_X\omega)(\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k-1}) = \omega(X, \boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_{k-1}).$$

Prove the homotopy formula

$$i_X d + d\,i_X = L_X,$$

where $L_X$ is the differentiation operator in the direction of the field $X$.

[The action of $L_X$ on a form is defined, using the phase flow $\{g^t\}$ of the field $X$, by the relation

$$(L_X\omega)(\boldsymbol{\xi}) = \frac{d}{dt}\bigg|_{t=0} (g^{t*}\omega)(\boldsymbol{\xi}).$$

$L_X$ is called the Lie derivative or fisherman's derivative: the flow carries all possible differential-geometric objects past the fisherman, and the fisherman sits there and differentiates them.]

Hint. We denote by $H$ the "homotopy operator" associating to a k-chain $\gamma: \sigma \to M$ the (k + 1)-chain $H\gamma: (I \times \sigma) \to M$ according to the formula $(H\gamma)(t, x) = g^t\gamma(x)$ (where $I = [0, 1]$). Then

$$g^1\gamma - \gamma = \partial(H\gamma) + H(\partial\gamma).$$


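Problem. Prove the formula for differentiating a vector product on three-dimensional euclidean space (or on a riemannian manifold):

$$\operatorname{curl}[\mathbf{a}, \mathbf{b}] = \{\mathbf{a}, \mathbf{b}\} + \mathbf{a}\operatorname{div}\mathbf{b} - \mathbf{b}\operatorname{div}\mathbf{a}$$

(where $\{\mathbf{a}, \mathbf{b}\} = L_{\mathbf{a}}\mathbf{b}$ is the Poisson bracket of the vector fields, cf. Section 39).

Hint. If $\tau$ is the volume element, then

$$i_{\operatorname{curl}[\mathbf{a}, \mathbf{b}]}\tau = d\,i_{\mathbf{a}}i_{\mathbf{b}}\tau, \qquad (\operatorname{div}\mathbf{a})\,\tau = d\,i_{\mathbf{a}}\tau, \qquad\text{and}\qquad \{\mathbf{a}, \mathbf{b}\} = L_{\mathbf{a}}\mathbf{b};$$

by using these relations and the fact that $d\tau = 0$, it is easy to derive the formula for $\operatorname{curl}[\mathbf{a}, \mathbf{b}]$ from the homotopy formula.

H Appendix 3: Cohomology and homology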

The set of all k-forms on $M$ is a vector space, the closed k-forms a subspace and the differentials of (k − 1)-forms a subspace of the subspace of closed forms. The quotient space

$$\frac{\text{(closed forms)}}{\text{(differentials)}} = H^k(M, \mathbb{R})$$

is called the k-th cohomology group of the manifold $M$. An element of this group is a class of closed forms differing from one another only by a differential.

Problem 14. Show that for the circle $S^1$ we have $H^1(S^1, \mathbb{R}) = \mathbb{R}$.

The dimension of the space $H^k(M, \mathbb{R})$ is called the k-th Betti number of $M$.

Problem 15. Find the first Betti number of the torus $T^2 = S^1 \times S^1$.


The  flux  of  an  incompressible  fluid  (without  sources)  over  the  surfaces  of 
two  concentric  spheres  is  the  same.  In  general,  when  integrating  a  closed  form 


Figure  165  Homologous  cycles 


over  a  fc-dimensional  cycle,  we  can  replace  the  cycle  with  another  one  pro¬ 
vided  that  their  difference  is  the  boundary  of  a  (k  +  l)-chain  (Figure  165): 


if  a  —  b  —  dck+ 1  and  do/  =  0. 

Poincare  called  two  such  cycles  a  and  b  homologous. 

With a suitable definition59 of the group of chains on a manifold M and its
subgroups of cycles and boundaries (i.e., cycles homologous to zero), the
quotient group

$$H_k(M) = \frac{(\text{cycles})}{(\text{boundaries})}$$

is called the k-th homology group of M.

An element of this group is a class of cycles homologous to one another.
The rank of this group is also equal to the k-th Betti number of M (“De
Rham’s Theorem”).

59 For this our group $\{c_k\}$ must be made smaller by identifying pieces which differ only by the
choice of parametrization f or the choice of polyhedron D. In particular, we may assume that
D is always one and the same simplex or cube. Furthermore, we must take every degenerate
k-cell (D, f, Or) to be zero, i.e., (D, f, Or) = 0 if $f = f_2 \circ f_1$, where $f_1: D \to D'$ and D' has dimension
smaller than k.
Symplectic manifolds


A symplectic structure on a manifold is a closed nondegenerate differential
2-form. The phase space of a mechanical system has a natural symplectic
structure.

On a symplectic manifold, as on a riemannian manifold, there is a natural
isomorphism between vector fields and 1-forms. A vector field on a symplectic
manifold corresponding to the differential of a function is called a
hamiltonian vector field. A vector field on a manifold determines a phase
flow, i.e., a one-parameter group of diffeomorphisms. The phase flow of a
hamiltonian vector field on a symplectic manifold preserves the symplectic
structure of phase space.

The vector fields on a manifold form a Lie algebra. The hamiltonian
vector fields on a symplectic manifold also form a Lie algebra. The operation
in this algebra is called the Poisson bracket.


37  Symplectic  structures  on  manifolds 

We  define  here  symplectic  manifolds,  hamiltonian  vector  fields,  and  the  standard  symplectic 
structure  on  the  cotangent  bundle. 


A  Definition 

Let $M^{2n}$ be an even-dimensional differentiable manifold. A symplectic
structure on $M^{2n}$ is a closed nondegenerate differential 2-form $\omega^2$ on $M^{2n}$:

$$d\omega^2 = 0 \qquad\text{and}\qquad \forall \xi \neq 0 \ \ \exists \eta : \ \omega^2(\xi, \eta) \neq 0 \qquad (\xi, \eta \in TM_x).$$

The pair $(M^{2n}, \omega^2)$ is called a symplectic manifold.

Example. Consider the vector space $\mathbb{R}^{2n}$ with coordinates $p_i, q_i$ and let $\omega^2 = \sum dp_i \wedge dq_i$.

Problem. Verify that $(\mathbb{R}^{2n}, \omega^2)$ is a symplectic manifold. For n = 1 the pair $(\mathbb{R}^2, \omega^2)$ is the pair
(the plane, area).
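
The verification asked for in the problem can be tried out by machine for a small n. The following Python sketch is only an illustration and is not part of the original text; the function names `omega` and `basis`, and the convention of storing a vector as the list (p_1, ..., p_n, q_1, ..., q_n), are assumptions made here.

```python
def omega(xi, eta):
    """Value of sum_i dp_i ^ dq_i on the pair of vectors (xi, eta).

    Vectors are lists (p_1, ..., p_n, q_1, ..., q_n); this layout is an
    assumption of the sketch, not a convention of the text.
    """
    n = len(xi) // 2
    return sum(xi[i] * eta[n + i] - xi[n + i] * eta[i] for i in range(n))


def basis(dim):
    """Standard basis of R^dim."""
    return [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]


n = 3
e = basis(2 * n)

# Skew-symmetry, checked on all pairs of basis vectors.
skew = all(omega(a, b) == -omega(b, a) for a in e for b in e)

# For every basis vector, list the basis vectors it pairs nontrivially with.
# The p_i direction pairs only with the q_i direction and vice versa, so the
# matrix of the form is invertible, which is the nondegeneracy condition.
partners = [[j for j in range(2 * n) if omega(e[i], e[j]) != 0]
            for i in range(2 * n)]
```

For n = 1 the call `omega([1, 0], [0, 1])` returns the oriented area of the unit square spanned by the p and q directions.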

The  following  example  explains  the  appearance  of  symplectic  manifolds 
in  dynamics.  Along  with  the  tangent  bundle  of  a  differentiable  manifold,  it  is 
often  useful  to  look  at  its  dual— the  cotangent  bundle. 

B  The  cotangent  bundle  and  its  symplectic  structure 

Let V be an n-dimensional differentiable manifold. A 1-form on the tangent
space to V at a point x is called a cotangent vector to V at x. The set of all
cotangent vectors to V at x forms an n-dimensional vector space, dual to
the tangent space $TV_x$. We will denote this vector space of cotangent vectors
by $T^*V_x$ and call it the cotangent space to V at x.

The union of the cotangent spaces to the manifold at all of its points is
called the cotangent bundle of V and is denoted by T*V. The set T*V has a
natural structure of a differentiable manifold of dimension 2n. A point of
T*V is a 1-form on the tangent space to V at some point of V. If q is a choice
of n local coordinates for points in V, then such a form is given by its n components
p. Together, the 2n numbers p, q form a collection of local coordinates
for points in T*V.

There is a natural projection $f: T^*V \to V$ (sending every 1-form on $TV_x$ to
the point x). The projection f is differentiable and surjective. The pre-image
of a point $x \in V$ under f is the cotangent space $T^*V_x$.

Theorem. The cotangent bundle T*V has a natural symplectic structure. In the
local coordinates described above, this symplectic structure is given by the
formula

$$\omega^2 = dp \wedge dq = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n.$$

Proof. First, we define a distinguished 1-form on T*V. Let $\xi \in T(T^*V)_p$ be
a vector tangent to the cotangent bundle at the point $p \in T^*V_x$ (Figure 166).
The derivative $f_*: T(T^*V) \to TV$ of the natural projection $f: T^*V \to V$ takes
$\xi$ to a vector $f_*\xi$ tangent to V at x. We define a 1-form $\omega^1$ on T*V by the
relation $\omega^1(\xi) = p(f_*\xi)$. In the local coordinates described above, this form
is $\omega^1 = p\,dq$. By the example in A, the closed 2-form $\omega^2 = d\omega^1$ is nondegenerate. □


Remark. Consider a lagrangian mechanical system with configuration manifold V and
lagrangian function L. It is easy to see that the lagrangian “generalized velocity” $\dot q$ is a tangent
vector to the configuration manifold V, and the “generalized momentum” $p = \partial L/\partial \dot q$
is a cotangent vector. Therefore, the “p, q” phase space of the lagrangian system is the cotangent
bundle of the configuration manifold. The theorem above shows that the phase space of a
mechanical problem has a natural symplectic manifold structure.
Figure 166 The 1-form p dq on the cotangent bundle


Problem. Show that the Legendre transform does not depend on the coordinate system: it
takes a function $L: TV \to \mathbb{R}$ on the tangent bundle to a function $H: T^*V \to \mathbb{R}$ on the cotangent
bundle.

C  Hamiltonian  vector  fields 

A riemannian structure on a manifold establishes an isomorphism between
the spaces of tangent vectors and 1-forms. A symplectic structure establishes
a similar isomorphism.

Definition. To each vector $\xi$ tangent to a symplectic manifold $(M^{2n}, \omega^2)$ at
the point x, we associate a 1-form $\omega^1_\xi$ on $TM_x$ by the formula

$$\omega^1_\xi(\eta) = \omega^2(\eta, \xi) \qquad \forall \eta \in TM_x.$$

Problem. Show that the correspondence $\xi \to \omega^1_\xi$ is an isomorphism between the 2n-dimensional
vector spaces of vectors and of 1-forms.

Example. In $\mathbb{R}^{2n} = \{(p, q)\}$ we will identify vectors and 1-forms by using the euclidean structure
$(x, x) = p^2 + q^2$. Then the correspondence $\xi \to \omega^1_\xi$ determines a transformation $\mathbb{R}^{2n} \to \mathbb{R}^{2n}$.


Problem. Calculate the matrix of this transformation in the basis p, q.

Answer. $\begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$.

We will denote by I the isomorphism $I: T^*M_x \to TM_x$ constructed above.
Now let H be a function on a symplectic manifold $M^{2n}$. Then dH is a differential
1-form on M, and at every point there is a tangent vector to M associated
to it. In this way we obtain a vector field I dH on M.

Definition. The vector field I dH is called a hamiltonian vector field; H is
called the hamiltonian function.
Example. If $M^{2n} = \mathbb{R}^{2n} = \{(p, q)\}$, then we obtain the phase velocity vector field of Hamilton’s
canonical equations:

$$\dot x = I\,dH(x) \iff \dot p = -\frac{\partial H}{\partial q} \quad\text{and}\quad \dot q = \frac{\partial H}{\partial p}.$$
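
As an illustration of the example, the field I dH can be evaluated numerically in the coordinates (p, q). The sketch below is not from the original text: the helper `hamiltonian_field`, the use of central differences, and the oscillator hamiltonian $H = (p^2 + q^2)/2$ are all assumptions chosen for the demonstration.

```python
def hamiltonian_field(H, p, q, h=1e-6):
    """Return (p_dot, q_dot) = (-dH/dq, dH/dp) at the point (p, q).

    H takes two lists (p, q); the partial derivatives are approximated by
    central differences with step h.
    """
    n = len(p)
    p_dot, q_dot = [], []
    for i in range(n):
        shift = [h if j == i else 0.0 for j in range(n)]
        p_up = [a + b for a, b in zip(p, shift)]
        p_dn = [a - b for a, b in zip(p, shift)]
        q_up = [a + b for a, b in zip(q, shift)]
        q_dn = [a - b for a, b in zip(q, shift)]
        dH_dp = (H(p_up, q) - H(p_dn, q)) / (2 * h)
        dH_dq = (H(p, q_up) - H(p, q_dn)) / (2 * h)
        p_dot.append(-dH_dq)
        q_dot.append(dH_dp)
    return p_dot, q_dot


def oscillator(p, q):
    """H = (p^2 + q^2) / 2, a harmonic oscillator chosen for illustration."""
    return 0.5 * (p[0] ** 2 + q[0] ** 2)


# At (p, q) = (0.3, 1.2) Hamilton's equations give p_dot = -q and q_dot = p.
p_dot, q_dot = hamiltonian_field(oscillator, [0.3], [1.2])
```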


38 Hamiltonian phase flows and their integral invariants

Liouville’s theorem asserts that the phase flow preserves volume. Poincaré found a whole
series of differential forms which are preserved by the hamiltonian phase flow.


A  Hamiltonian  phase  flows  preserve  the  symplectic  structure 

Let $(M^{2n}, \omega^2)$ be a symplectic manifold and $H: M^{2n} \to \mathbb{R}$ a function. Assume
that the vector field I dH corresponding to H gives a 1-parameter group of
diffeomorphisms $g^t: M^{2n} \to M^{2n}$:

$$\left.\frac{d}{dt}\right|_{t=0} g^t x = I\,dH(x).$$

The group $g^t$ is called the hamiltonian phase flow with hamiltonian function H.


Theorem. A hamiltonian phase flow preserves the symplectic structure:

$$(g^t)^*\omega^2 = \omega^2.$$

In the case n = 1, $M^{2n} = \mathbb{R}^2$, this theorem says that the phase flow $g^t$
preserves area (Liouville’s theorem).
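
The case n = 1 can be illustrated numerically: area is preserved exactly when the jacobian determinant of $g^t$ equals 1. The sketch below is an illustration only and makes several assumptions not found in the text: the pendulum hamiltonian $H = p^2/2 - \cos q$, a classical Runge-Kutta approximation of the flow, and a central-difference estimate of the jacobian. The function names are hypothetical.

```python
import math


def pendulum_field(p, q):
    """Hamiltonian field of H = p^2/2 - cos q: p' = -sin q, q' = p."""
    return -math.sin(q), p


def flow(p, q, t, steps=2000):
    """Approximate g^t(p, q) with the classical fourth-order Runge-Kutta scheme."""
    dt = t / steps
    for _ in range(steps):
        k1p, k1q = pendulum_field(p, q)
        k2p, k2q = pendulum_field(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
        k3p, k3q = pendulum_field(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
        k4p, k4q = pendulum_field(p + dt * k3p, q + dt * k3q)
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    return p, q


def jacobian_det(p, q, t, h=1e-5):
    """Determinant of the derivative of g^t at (p, q), by central differences."""
    p_up, q_up = flow(p + h, q, t)
    p_dn, q_dn = flow(p - h, q, t)
    a, c = (p_up - p_dn) / (2 * h), (q_up - q_dn) / (2 * h)
    p_up, q_up = flow(p, q + h, t)
    p_dn, q_dn = flow(p, q - h, t)
    b, d = (p_up - p_dn) / (2 * h), (q_up - q_dn) / (2 * h)
    return a * d - b * c


# Up to discretization error, this should be close to 1.
det = jacobian_det(0.5, 1.0, 3.0)
```

The integrator is not itself symplectic, so the agreement with 1 holds only up to the accuracy of the scheme and of the finite differences.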

For  the  proof  of  this  theorem,  it  is  useful  to  introduce  the  following  nota¬ 
tion  (Figure  167). 

Let M be an arbitrary manifold, c a k-chain on M and $g^t: M \to M$ a one-parameter
family of differentiable mappings. We will construct a (k + 1)-chain
Jc on M, which we will call the track of the chain c under the homotopy
$g^t$, $0 \le t \le \tau$.

Let (D, f, Or) be one of the cells in the chain c. To this cell will be associated
a cell (D', f', Or') in the chain Jc, where D' = I × D is the direct product of
the interval $0 \le t \le \tau$ and D; the mapping $f': D' \to M$ is obtained from
$f: D \to M$ by the formula $f'(t, x) = g^t f(x)$; and the orientation Or' of the
space $\mathbb{R}^{k+1}$ containing D' is given by the frame $e_0, e_1, \ldots, e_k$, where $e_0$ is the
unit vector of the t axis, and $e_1, \ldots, e_k$ is an oriented frame for D.

Figure 167 Track of a cycle under homotopy

We could say that Jc is the chain swept out by c under the homotopy $g^t$,
$0 \le t \le \tau$. The boundary of the chain Jc consists of “end-walls” made up
of the initial and final positions of c, and “side surfaces” filled in by the
boundary of c.

It is easy to verify that under the choice of orientation made above,

$$(1) \qquad \partial(Jc_k) = g^\tau c_k - c_k - J\partial c_k.$$


Lemma. Let $\gamma$ be a 1-chain in the symplectic manifold $(M^{2n}, \omega^2)$. Let $g^t$ be a
phase flow on M with hamiltonian function H. Then

$$\frac{d}{d\tau} \int_{J\gamma} \omega^2 = \int_{g^\tau\gamma} dH.$$

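Proof. It is sufficient to consider a chain $\gamma$ with one cell $f: [0, 1] \to M$. We
introduce the notation

$$f'(s, t) = g^t f(s), \qquad \xi = \frac{\partial f'}{\partial s} \quad\text{and}\quad \eta = \frac{\partial f'}{\partial t} \in TM_{f'(s,t)}.$$

By the definition of the integral

$$\int_{J\gamma} \omega^2 = \int_0^1 \int_0^\tau \omega^2(\xi, \eta)\,dt\,ds.$$

But by the definition of the phase flow, $\eta$ is a vector (at the point f'(s, t)) of
the hamiltonian field with hamiltonian function H. By definition of a hamiltonian
field, $\omega^2(\xi, \eta) = dH(\xi)$. Thus

$$\int_{J\gamma} \omega^2 = \int_0^\tau \left( \int_{g^t\gamma} dH \right) dt.$$

□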


Corollary. If the chain $\gamma$ is closed ($\partial\gamma = 0$), then $\int_{J\gamma} \omega^2 = 0$.

Proof. $\int_\gamma dH = \int_{\partial\gamma} H = 0$. □

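Proof of the theorem. We consider any 2-chain c. We have

$$0 \overset{1}{=} \int_{Jc} d\omega^2 \overset{2}{=} \int_{\partial Jc} \omega^2 \overset{3}{=} \int_{g^\tau c} \omega^2 - \int_c \omega^2 - \int_{J\partial c} \omega^2 \overset{4}{=} \int_{g^\tau c} \omega^2 - \int_c \omega^2$$

(1 since $\omega^2$ is closed, 2 by Stokes’ formula, 3 by formula (1), 4 by the corollary
above with $\gamma = \partial c$). Thus the integrals of the form $\omega^2$ on any chain c and on
its image $g^\tau c$ are the same. □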


Problem. Is every one-parameter group of diffeomorphisms of $M^{2n}$ which preserves the symplectic
structure a hamiltonian phase flow?

Hint.  Cf.  Section  40. 
B Integral invariants

Let $g: M \to M$ be a differentiable map.


Definition. A differential k-form $\omega$ is called an integral invariant of the map g
if the integrals of $\omega$ on any k-chain c and on its image under g are the same:

$$\int_{gc} \omega = \int_c \omega.$$

Example. If $M = \mathbb{R}^2$ and $\omega^2 = dp \wedge dq$ is the area element, then $\omega^2$ is an integral invariant of
any map g with jacobian 1.

Problem. Show that a form $\omega^k$ is an integral invariant of a map g if and only if $g^*\omega^k = \omega^k$.

Problem. Show that if the forms $\omega^k$ and $\omega^l$ are integral invariants of the map g, then the form
$\omega^k \wedge \omega^l$ is also an integral invariant of g.

The  theorem  in  subsection  A  can  be  formulated  as  follows: 

Theorem. The form $\omega^2$ giving the symplectic structure is an integral invariant
of a hamiltonian phase flow.

We now consider the exterior powers of $\omega^2$,

$$(\omega^2)^2 = \omega^2 \wedge \omega^2, \qquad (\omega^2)^3 = \omega^2 \wedge \omega^2 \wedge \omega^2, \qquad \ldots.$$

Corollary. Each of the forms $(\omega^2)^2, (\omega^2)^3, (\omega^2)^4, \ldots$ is an integral invariant of a
hamiltonian phase flow.

Problem. Suppose that the dimension of the symplectic manifold $(M^{2n}, \omega^2)$ is 2n. Show that
$(\omega^2)^k = 0$ for k > n, and that $(\omega^2)^n$ is a nondegenerate 2n-form on $M^{2n}$.

We define a volume element on $M^{2n}$ using $(\omega^2)^n$. Then, a hamiltonian
phase flow preserves volume, and we obtain Liouville’s theorem from the
corollary above.

Example. Consider the symplectic coordinate space $M^{2n} = \mathbb{R}^{2n} = \{(p, q)\}$,
$\omega^2 = dp \wedge dq = \sum dp_i \wedge dq_i$. In this case the form $(\omega^2)^k$ is proportional to
the form

$$\omega^{2k} = \sum_{i_1 < \cdots < i_k} dp_{i_1} \wedge \cdots \wedge dp_{i_k} \wedge dq_{i_1} \wedge \cdots \wedge dq_{i_k}.$$

The integral of $\omega^{2k}$ is equal to the sum of the oriented volumes of projections
onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$.

A map $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is called canonical if it has $\omega^2$ as an integral invariant.
A canonical map is generally called a canonical transformation. Each of the
forms $\omega^4, \omega^6, \ldots, \omega^{2n}$ is an integral invariant of every canonical transformation.
Therefore, under a canonical transformation, the sum of the oriented areas
of projections onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$, $1 \le k \le n$,
is preserved. In particular, canonical transformations preserve volume.

The hamiltonian phase flow given by the equations $\dot p = -\partial H/\partial q$, $\dot q = \partial H/\partial p$
consists of canonical transformations $g^t$.

The  integral  invariants  considered  above  are  also  called  absolute  integral 
invariants. 


Definition. A differential k-form $\omega$ is called a relative integral invariant of the
map $g: M \to M$ if $\int_{gc} \omega = \int_c \omega$ for every closed k-chain c.

Theorem. Let $\omega$ be a relative integral invariant of a map g. Then $d\omega$ is an absolute
integral invariant of g.

Proof. Let c be a (k + 1)-chain. Then

$$\int_c d\omega \overset{1}{=} \int_{\partial c} \omega \overset{2}{=} \int_{g\partial c} \omega \overset{3}{=} \int_{\partial gc} \omega \overset{4}{=} \int_{gc} d\omega$$

(1 and 4 are by Stokes’ formula, 2 by the definition of relative invariant, and
3 by the definition of boundary). □


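Example. A canonical map $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ has the 1-form

$$\omega^1 = p\,dq = \sum_{i=1}^n p_i\,dq_i$$

as a relative integral invariant.

In fact, every closed chain c on $\mathbb{R}^{2n}$ is the boundary of some chain $\sigma$, and we find

$$\int_{gc} \omega^1 \overset{1}{=} \int_{g\partial\sigma} \omega^1 \overset{2}{=} \int_{\partial g\sigma} \omega^1 \overset{3}{=} \int_{g\sigma} d\omega^1 \overset{4}{=} \int_\sigma d\omega^1 \overset{5}{=} \int_{\partial\sigma} \omega^1 \overset{6}{=} \int_c \omega^1$$

(1 and 6 are by definition of $\sigma$, 2 by definition of $\partial$, 3 and 5 by Stokes’ formula, and 4 since g
is canonical and $d\omega^1 = d(p\,dq) = dp \wedge dq = \omega^2$).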
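
The example can be checked numerically in the plane. The sketch below is an illustration under assumptions of its own: the closed chain is a polygon approximating a circle, and the canonical map is the shear (p, q) → (p + aq, q), which preserves dp ∧ dq. The function names are hypothetical.

```python
import math


def loop_integral_p_dq(points):
    """Integral of p dq along the closed polygon through `points`.

    Each point is a pair (p, q). On a straight segment the integral of p dq
    equals (average of p) * (change in q), so the sum is exact for polygons.
    """
    total = 0.0
    m = len(points)
    for k in range(m):
        p0, q0 = points[k]
        p1, q1 = points[(k + 1) % m]
        total += 0.5 * (p0 + p1) * (q1 - q0)
    return total


def shear(point, a=2.0):
    """(p, q) -> (p + a*q, q); canonical since d(p + a*q) ^ dq = dp ^ dq."""
    p, q = point
    return (p + a * q, q)


# Closed polygon approximating a circle of radius 1 centered at (p, q) = (3, 1).
curve = [(3.0 + math.cos(2 * math.pi * k / 200),
          1.0 + math.sin(2 * math.pi * k / 200)) for k in range(200)]

before = loop_integral_p_dq(curve)
after = loop_integral_p_dq([shear(pt) for pt in curve])
```

The two values agree up to rounding, and both approximate the area enclosed by the circle, as expected from $d(p\,dq) = dp \wedge dq$.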


Problem. Let $d\omega^k$ be an absolute integral invariant of the map $g: M \to M$. Does it follow that
$\omega^k$ is a relative integral invariant?


Answer. No, if there is a closed k-chain on M which is not a boundary.

C  The  law  of  conservation  of  energy 

Theorem.  The  function  H  is  a  first  integral  of  the  hamiltonian  phase  flow  with 
hamiltonian  function  H. 

Proof. The derivative of H in the direction of a vector $\eta$ is equal to the value
of dH on $\eta$. By definition of the hamiltonian field $\eta = I\,dH$ we find

$$dH(\eta) = \omega^2(\eta, I\,dH) = \omega^2(\eta, \eta) = 0.$$ □


Problem.  Show  that  the  1-form  dH  is  an  integral  invariant  of  the  phase  flow  with  hamiltonian 
function  H. 
39 The Lie algebra of vector fields

Every  pair  of  vector  fields  on  a  manifold  determines  a  new  vector  field,  called  their  Poisson 
bracket.60  The  Poisson  bracket  operation  makes  the  vector  space  of  infinitely  differentiable 
vector  fields  on  a  manifold  into  a  Lie  algebra. 

A  Lie  algebras 

One  example  of  a  Lie  algebra  is  a  three-dimensional  oriented  euclidean 
vector  space  equipped  with  the  operation  of  vector  multiplication.  The 
vector  product  is  bilinear,  skew-symmetric,  and  satisfies  the  Jacobi  identity 

$$[[A, B], C] + [[B, C], A] + [[C, A], B] = 0.$$

Definition. A Lie algebra is a vector space L, together with a bilinear skew-symmetric
operation L × L → L which satisfies the Jacobi identity.

The operation is usually denoted by square brackets and called the
commutator.

Problem. Show that the set of n × n matrices becomes a Lie algebra if we define the commutator
by [A, B] = AB − BA.
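
Before proving the statement of the problem, one can test it on concrete matrices. The following sketch, which is an illustration and not part of the text, checks skew-symmetry and the Jacobi identity for the commutator on three arbitrarily chosen integer 3 × 3 matrices; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    """Entrywise negation of a matrix."""
    return [[-x for x in row] for row in a]


def bracket(a, b):
    """Commutator [A, B] = AB - BA."""
    return add(matmul(a, b), neg(matmul(b, a)))


# Arbitrary test matrices with integer entries, so the arithmetic is exact.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, 5], [1, 1, 0]]
C = [[3, 0, 2], [1, 1, 0], [0, 2, 1]]

# [[A, B], C] + [[B, C], A] + [[C, A], B]
jacobi = add(add(bracket(bracket(A, B), C), bracket(bracket(B, C), A)),
             bracket(bracket(C, A), B))
zero = [[0] * 3 for _ in range(3)]
```

A check on particular matrices is of course not a proof; the identity follows in general from the associativity of matrix multiplication.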


B  Vector  fields  and  differential  operators 

Let M be a smooth manifold and A a smooth vector field on M: at every
point $x \in M$ we are given a tangent vector $A(x) \in TM_x$. With every such
vector field we associate the following two objects:

1. The one-parameter group of diffeomorphisms or flow $A^t: M \to M$ for which
A is the velocity vector field (Figure 168):61

$$\left.\frac{d}{dt}\right|_{t=0} A^t x = A(x).$$

2. The first-order differential operator $L_A$. We refer here to the differentiation
of functions in the direction of the field A: for any function $\varphi: M \to \mathbb{R}$
the derivative in the direction of A is a new function $L_A\varphi$, whose value
at a point x is

$$(L_A\varphi)(x) = \left.\frac{d}{dt}\right|_{t=0} \varphi(A^t x).$$

60 Or Lie bracket [Trans. note].

61 By theorems of existence, uniqueness, and differentiability in the theory of ordinary differential
equations, the group $A^t$ is defined if the manifold M is compact. In the general case
the maps $A^t$ are defined only in a neighborhood of x and only for small t; this is enough for the
following constructions.

Figure 168 The group of diffeomorphisms given by a vector field

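Problem. Show that the operator $L_A$ is linear:

$$L_A(\lambda_1\varphi_1 + \lambda_2\varphi_2) = \lambda_1 L_A\varphi_1 + \lambda_2 L_A\varphi_2 \qquad (\lambda_1, \lambda_2 \in \mathbb{R}).$$

Also, prove Leibniz’s formula $L_A(\varphi_1\varphi_2) = \varphi_1 L_A\varphi_2 + \varphi_2 L_A\varphi_1$.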

Example. Let $(x_1, \ldots, x_n)$ be local coordinates on M. In this coordinate system the vector A(x)
is given by its components $(A_1(x), \ldots, A_n(x))$; the flow $A^t$ is given by the system of differential
equations

$$\dot x_1 = A_1(x), \ \ldots, \ \dot x_n = A_n(x)$$

and, therefore, the derivative of $\varphi = \varphi(x_1, \ldots, x_n)$ in the direction A is

$$L_A\varphi = A_1 \frac{\partial\varphi}{\partial x_1} + \cdots + A_n \frac{\partial\varphi}{\partial x_n}.$$

We could say that in the coordinates $(x_1, \ldots, x_n)$ the operator $L_A$ has the form

$$L_A = A_1 \frac{\partial}{\partial x_1} + \cdots + A_n \frac{\partial}{\partial x_n};$$

this is the general form of a first-order linear differential operator on coordinate space.

Problem. Show that the correspondences between vector fields A, flows $A^t$, and differentiations
$L_A$ are one-to-one.

C  The  Poisson  bracket  of  vector  fields 

Suppose that we are given two vector fields A and B on a manifold M. The
corresponding flows $A^t$ and $B^s$ do not, in general, commute: $A^t B^s \neq B^s A^t$
(Figure 169).

Problem. Find an example.

Solution. The fields $A = e_1$, $B = x_1 e_2$ on the $(x_1, x_2)$ plane.
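
For the fields of this solution both flows can be written down explicitly, so the failure to commute can be seen directly. The sketch below is an illustration; the function names `flow_A` and `flow_B` and the sample values of t, s and x are assumptions.

```python
def flow_A(t, x):
    """Flow of A = e_1: translation by t along the x_1 axis."""
    x1, x2 = x
    return (x1 + t, x2)


def flow_B(s, x):
    """Flow of B = x_1 e_2: a shear, since x_1 is constant along B."""
    x1, x2 = x
    return (x1, x2 + s * x1)


x = (1.0, 0.0)
t, s = 0.5, 2.0
ab = flow_A(t, flow_B(s, x))   # A^t B^s x
ba = flow_B(s, flow_A(t, x))   # B^s A^t x
```

Here `ab` is (1.5, 2.0) and `ba` is (1.5, 3.0): the two end points differ by st in the $x_2$ coordinate, whatever the starting point.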


Figure  169  Non-commutative  flows 
To measure the degree of noncommutativity of the two flows $A^t$ and $B^s$ we
consider the points $A^t B^s x$ and $B^s A^t x$. In order to estimate the difference
between these points, we compare the value at them of some smooth function
$\varphi$ on the manifold M. The difference

$$\Delta(t; s; x) = \varphi(A^t B^s x) - \varphi(B^s A^t x)$$

is clearly a differentiable function which is zero for s = 0 and for t = 0.
Therefore, the first term different from 0 in the Taylor series in s and t of $\Delta$
at 0 contains st, and the other terms of second order vanish. We will calculate
this principal bilinear term of $\Delta$ at 0.


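Lemma 1. The mixed partial derivative $\partial^2\Delta/\partial s\,\partial t$ at 0 is equal to the commutator
of differentiation in the directions A and B:

$$\left.\frac{\partial^2}{\partial s\,\partial t}\right|_{s=t=0} \{\varphi(A^t B^s x) - \varphi(B^s A^t x)\} = (L_B L_A\varphi - L_A L_B\varphi)(x).$$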

Proof. By the definition of $L_A$,

$$\left.\frac{\partial}{\partial t}\right|_{t=0} \varphi(A^t B^s x) = (L_A\varphi)(B^s x).$$

If we denote the function $L_A\varphi$ by $\psi$, then by the definition of $L_B$

$$\left.\frac{\partial}{\partial s}\right|_{s=0} \psi(B^s x) = (L_B\psi)x.$$

Thus,

$$\left.\frac{\partial^2}{\partial s\,\partial t}\right|_{s=t=0} \varphi(A^t B^s x) = (L_B L_A\varphi)x.$$ □

We now consider the commutator of differentiation operators $L_B L_A - L_A L_B$.
At first glance this is a second-order differential operator.

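Lemma 2. The operator $L_B L_A - L_A L_B$ is a first-order linear differential
operator.

Proof. Let $(A_1, \ldots, A_n)$ and $(B_1, \ldots, B_n)$ be the components of the fields
A and B in the local coordinate system $(x_1, \ldots, x_n)$ on M. Then

$$L_B L_A\varphi = \sum_{i=1}^n B_i \frac{\partial}{\partial x_i} \sum_{j=1}^n A_j \frac{\partial\varphi}{\partial x_j} = \sum_{i,j=1}^n B_i \frac{\partial A_j}{\partial x_i} \frac{\partial\varphi}{\partial x_j} + \sum_{i,j=1}^n B_i A_j \frac{\partial^2\varphi}{\partial x_i\,\partial x_j}.$$

If we subtract $L_A L_B\varphi$, the term with the second derivatives of $\varphi$ vanishes,
and we obtain

$$(L_B L_A - L_A L_B)\varphi = \sum_{i,j=1}^n \left( B_i \frac{\partial A_j}{\partial x_i} - A_i \frac{\partial B_j}{\partial x_i} \right) \frac{\partial\varphi}{\partial x_j}.$$ □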
Since every first-order linear differential operator is given by a vector
field, our operator $L_B L_A - L_A L_B$ also corresponds to some vector field C.

Definition. The Poisson bracket or commutator of two vector fields A and
B on a manifold M62 is the vector field C for which

$$L_C = L_B L_A - L_A L_B.$$

The Poisson bracket of two vector fields is denoted by

$$C = [A, B].$$


Problem. Suppose that the vector fields A and B are given by their components $A_i$, $B_i$ in coordinates
$x_i$. Find the components of the Poisson bracket.

Solution. In the proof of Lemma 2 we proved the formula

$$[A, B]_j = \sum_{i=1}^n \left( B_i \frac{\partial A_j}{\partial x_i} - A_i \frac{\partial B_j}{\partial x_i} \right).$$

Problem. Let $A_1$ be the linear vector field of velocities of a rigid body rotating with angular
velocity $\omega_1$ around 0, and $A_2$ the same thing with angular velocity $\omega_2$. Find the Poisson bracket
$[A_1, A_2]$.
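
A numerical experiment can suggest the answer to the last problem. The sketch below, which is an illustration and not part of the text, evaluates the component formula for the bracket by central differences on the rotation fields x → [ω, x] and compares the result with the velocity field of a rotation with angular velocity [ω_1, ω_2]. The function names and the sample vectors are assumptions.

```python
def cross(a, b):
    """Vector product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotation_field(w):
    """Velocity field x -> [w, x] of a rotation with angular velocity w."""
    return lambda x: cross(w, x)


def partial(field, x, i, h=1e-3):
    """Central-difference derivative of a vector field with respect to x_i."""
    x_up = tuple(c + (h if k == i else 0.0) for k, c in enumerate(x))
    x_dn = tuple(c - (h if k == i else 0.0) for k, c in enumerate(x))
    f_up, f_dn = field(x_up), field(x_dn)
    return tuple((a - b) / (2 * h) for a, b in zip(f_up, f_dn))


def poisson_bracket(A, B, x):
    """[A, B]_j = sum_i (B_i dA_j/dx_i - A_i dB_j/dx_i), with the sign of the text."""
    a, b = A(x), B(x)
    result = [0.0, 0.0, 0.0]
    for i in range(3):
        dA, dB = partial(A, x, i), partial(B, x, i)
        for j in range(3):
            result[j] += b[i] * dA[j] - a[i] * dB[j]
    return tuple(result)


w1, w2 = (1.0, 2.0, 0.5), (-0.5, 1.0, 3.0)
x = (0.7, -1.1, 2.0)
lhs = poisson_bracket(rotation_field(w1), rotation_field(w2), x)
rhs = cross(cross(w1, w2), x)   # field of a rotation with angular velocity [w1, w2]
```

With the sign convention of the definition above, the two vectors `lhs` and `rhs` agree up to rounding at the sample point; with the opposite convention used in many books the sign of the result changes.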


D  The  Jacobi  identity 

Theorem.  The  Poisson  bracket  makes  the  vector  space  of  vector  fields  on  a 
manifold  M  into  a  Lie  algebra. 

Proof. Linearity and skew-symmetry of the Poisson bracket are clear. We
will prove the Jacobi identity. By definition of Poisson bracket, we have

$$L_{[[A,B],C]} = L_C L_{[A,B]} - L_{[A,B]} L_C = L_C L_B L_A - L_C L_A L_B + L_A L_B L_C - L_B L_A L_C.$$

There will be 12 terms in all in the sum $L_{[[A,B],C]} + L_{[[B,C],A]} + L_{[[C,A],B]}$.
Each term appears in the sum twice, with opposite signs. □

E A condition for the commutativity of flows

Let A and B be vector fields on a manifold M.

Theorem. The two flows $A^t$ and $B^s$ commute if and only if the Poisson bracket
of the corresponding vector fields [A, B] is equal to zero.

Proof. If $A^t B^s \equiv B^s A^t$, then [A, B] = 0 by Lemma 1. If [A, B] = 0, then,
by Lemma 1,

$$\varphi(A^t B^s x) - \varphi(B^s A^t x) = o(s^2 + t^2), \qquad s \to 0 \text{ and } t \to 0$$

for any function $\varphi$ at any point x. We will show that this implies $\varphi(A^t B^s x) = \varphi(B^s A^t x)$
for sufficiently small s and t. If we apply this to the local coordinates
($\varphi = x_1, \ldots, \varphi = x_n$), we obtain $A^t B^s = B^s A^t$.

62 In many books the bracket is given the opposite sign. Our sign agrees with the sign of the
commutator in the theory of Lie groups (cf. subsection F).

Consider the rectangle $0 \le t \le t_0$, $0 \le s \le s_0$ (Figure 170) in the t, s-plane. To every path
going from (0, 0) to $(t_0, s_0)$ and consisting of a finite number of intervals in the coordinate directions,
we associate a product of transformations of the flows $A^t$ and $B^s$. Namely, to each interval
$t_1 \le t \le t_2$ we associate $A^{t_2 - t_1}$, and to each interval $s_1 \le s \le s_2$ we associate $B^{s_2 - s_1}$; the
transformations are applied in the order in which the intervals occur in the path, beginning at (0, 0).
For example, the sides ($0 \le t \le t_0$, s = 0) and ($t = t_0$, $0 \le s \le s_0$) correspond to the product
$B^{s_0} A^{t_0}$, and the sides (t = 0, $0 \le s \le s_0$) and ($s = s_0$, $0 \le t \le t_0$) to the product $A^{t_0} B^{s_0}$.

Figure 170 Proof of the commutativity of flows

Figure 171 Curvilinear quadrilateral βγδεα

In addition, we associate to each such path in the (t, s)-plane a path on the manifold M
starting at the point x and composed of trajectories of the flows $A^t$ and $B^s$ (Figure 171). If a
path in the (t, s)-plane corresponds to the product $A^{t_1} B^{s_1} \cdots A^{t_n} B^{s_n}$, then on the manifold M
the corresponding path ends at the point $A^{t_1} B^{s_1} \cdots A^{t_n} B^{s_n} x$. Our goal will be to show that all
these paths actually terminate at the one point $A^{t_0} B^{s_0} x = B^{s_0} A^{t_0} x$.

We partition the intervals $0 \le t \le t_0$ and $0 \le s \le s_0$ into N equal parts, so that the whole
rectangle is divided into $N^2$ small rectangles. The passage from the sides $(0, 0) - (t_0, 0) - (t_0, s_0)$
to the sides $(0, 0) - (0, s_0) - (t_0, s_0)$ can be accomplished in $N^2$ steps, in each of which a pair
of neighboring sides of a small rectangle is exchanged for the other pair (Figure 172). In general,
this small rectangle corresponds to a non-closed curvilinear quadrilateral βγδεα on the manifold
M (Figure 171). Consider the distance63 between its vertices α and β corresponding to the largest
values of s and t. As we saw earlier, $\rho(\alpha, \beta) \le C_1 N^{-3}$ (where the constant $C_1 > 0$ does not
depend on N). Using the theorem of the differentiability of solutions of differential equations
with respect to the initial data, it is not difficult to derive from this a bound on the distance
between the ends α' and β' of the paths xδγββ' and xδεαα' on M: $\rho(\alpha', \beta') \le C_2 N^{-3}$, where the
constant $C_2 > 0$ again does not depend on N. But we broke up the whole journey from $B^{s_0} A^{t_0} x$
to $A^{t_0} B^{s_0} x$ into $N^2$ such pieces. Thus, $\rho(A^{t_0} B^{s_0} x, B^{s_0} A^{t_0} x) \le N^2 C_2 N^{-3}$ for all N. Therefore,
$A^{t_0} B^{s_0} x = B^{s_0} A^{t_0} x$. □

Figure 172 Going from one pair of sides to the other.

F  Appendix:  Lie  algebras  and  Lie  groups 

A Lie group is a group G which is a differentiable manifold, and for which the
operations (product and inverse) are differentiable maps G × G → G and
G → G.

The tangent space, $TG_e$, to a Lie group G at the identity has a natural
Lie algebra structure; it is defined as follows:

For each tangent vector $A \in TG_e$ there is a one-parameter subgroup $A^t \subset G$
with velocity vector $A = (d/dt)|_{t=0} A^t$.

The degree of non-commutativity of two subgroups $A^t$ and $B^t$ is measured
by the product $A^t B^s A^{-t} B^{-s}$. It turns out that there is one and only one
subgroup $C^r$ for which

$$\rho(A^t B^s A^{-t} B^{-s}, C^{st}) = o(s^2 + t^2) \qquad\text{as } s \text{ and } t \to 0.$$

The corresponding vector $C = (d/dr)|_{r=0} C^r$ is called the Lie bracket
C = [A, B] of the vectors A and B. It can be verified that the operation of
Lie bracket introduced in this way makes the space $TG_e$ into a Lie algebra
(i.e., the operation is bilinear, skew-symmetric, and satisfies the Jacobi
identity). This algebra is called the Lie algebra of the Lie group G.
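
For a group of matrices the construction can be followed by hand. The sketch below is an illustration under assumptions not made in the text: it takes the group of upper triangular 3 × 3 matrices with ones on the diagonal, where a strictly upper triangular X with X·X = 0 generates the one-parameter subgroup E + tX, and it uses the matrix commutator AB − BA as the bracket. The function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def unit(i, j, n=3):
    """Matrix with a single 1 in row i, column j."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def one_parameter_subgroup(X, t):
    """The matrix E + tX; this equals exp(tX) because X*X = 0 for the X used here."""
    n = len(X)
    return [[(1 if i == j else 0) + t * X[i][j] for j in range(n)]
            for i in range(n)]


A, B = unit(0, 1), unit(1, 2)
t, s = 3, 5

# The product A^t B^s A^{-t} B^{-s}.
group_commutator = matmul(
    matmul(one_parameter_subgroup(A, t), one_parameter_subgroup(B, s)),
    matmul(one_parameter_subgroup(A, -t), one_parameter_subgroup(B, -s)))

# C = [A, B] = AB - BA, and the element C^{st} of its one-parameter subgroup.
AB, BA = matmul(A, B), matmul(B, A)
C = [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]
expected = one_parameter_subgroup(C, s * t)
```

In this particular group the product $A^t B^s A^{-t} B^{-s}$ coincides with $C^{st}$ exactly, so the remainder $o(s^2 + t^2)$ vanishes identically; in general only the estimate holds.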


Problem.  Compute  the  bracket  operation  in  the  Lie  algebra  of  the  group  SO( 3)  of  rotations  in 
three-dimensional  euclidean  space. 

Lemma 1 shows that the Poisson bracket of vector fields can be defined
as the Lie bracket for the “infinite-dimensional Lie group” of all diffeomorphisms64
of the manifold M.

63 In some riemannian metric on M.

64 Our choice of sign in the definition of Poisson bracket was determined by this correspondence.
On the other hand, the Lie bracket can be defined using the Poisson
bracket of vector fields on a Lie group G. Let $g \in G$. Right translation $R_g$ is
the map $R_g: G \to G$, $R_g h = hg$. The differential of $R_g$ at the point e maps
$TG_e$ into $TG_g$. In this way, every vector $A \in TG_e$ corresponds to a vector
field on the group: it consists of the right translations $(R_g)_* A$ and is called a
right-invariant vector field. Clearly, a right-invariant vector field on a group
is uniquely determined by its value at the identity.

Problem. Show that the Poisson bracket of right-invariant vector fields on a
Lie group G is a right-invariant vector field, and its value at the identity of
the group is equal to the Lie bracket of the values of the original vector fields
at the identity.


40  The  Lie  algebra  of  hamiltonian  functions 

The  hamiltonian  vector  fields  on  a  symplectic  manifold  form  a  subalgebra  of  the  Lie  algebra  of 
all  fields.  The  hamiltonian  functions  also  form  a  Lie  algebra:  the  operation  in  this  algebra  is 
called  the  Poisson  bracket  of  functions.  The  first  integrals  of  a  hamiltonian  phase  flow  form  a 
subalgebra  of  the  Lie  algebra  of  hamiltonian  functions. 


A  The  Poisson  bracket  of  two  functions 

Let $(M^{2n}, \omega^2)$ be a symplectic manifold. To a given function $H: M^{2n} \to \mathbb{R}$
on the symplectic manifold there corresponds a one-parameter group
$g^t_H: M^{2n} \to M^{2n}$ of canonical transformations of $M^{2n}$, namely the phase flow of the
hamiltonian function equal to H. Let $F: M^{2n} \to \mathbb{R}$ be another function on $M^{2n}$.


Definition. The Poisson bracket (F, H) of functions F and H given on a
symplectic manifold $(M^{2n}, \omega^2)$ is the derivative of the function F in the
direction of the phase flow with hamiltonian function H:

$$(F, H)(x) = \left.\frac{d}{dt}\right|_{t=0} F(g^t_H(x)).$$

Thus, the Poisson bracket of two functions on M is again a function on M.


Corollary  1.  A  function  F  is  a  first  integral  of  the  phase  flow  with  hamiltonian 
function  H  if  and  only  if  its  Poisson  bracket  with  H  is  identically  zero: 
(F,  H)  =  0. 


We can give the definition of Poisson bracket in a slightly different form
if we use the isomorphism I between 1-forms and vector fields on a symplectic
manifold $(M^{2n}, \omega^2)$. This isomorphism is defined by the relation (cf. Section
37)

$$\omega^2(\eta, I\omega^1) = \omega^1(\eta).$$

The velocity vector of the phase flow $g^t_H$ is I dH. This implies
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Corollary  2.  The  Poisson  bracket  of  the  functions  F  and  H  is  equal  to  the 
value  of  the  l-form  dF  on  the  velocity  vector  I  dH  of  the  phase  flow  with 
hamiltonian  function  H  : 

(F,  H )  =  dF(IdH). 

Using  the  preceding  formula  again,  we  obtain 

Corollary 3. The Poisson bracket of the functions $F$ and $H$ is equal to the
"skew scalar product" of the velocity vectors of the phase flows with hamiltonian
functions $H$ and $F$:

$$(F, H) = \omega^2(I\,dH, I\,dF).$$

It is now clear that

Corollary 4. The Poisson bracket of the functions $F$ and $H$ is a skew-symmetric
bilinear function of $F$ and $H$:

$$(F, H) = -(H, F)$$

and

$$(H, \lambda_1 F_1 + \lambda_2 F_2) = \lambda_1 (H, F_1) + \lambda_2 (H, F_2) \qquad (\lambda_i \in \mathbb{R}).$$

Although  the  arguments  above  are  obvious,  they  lead  to  nontrivial 
deductions,  including  the  following  generalization  of  a  theorem  of  E.  Noether. 

Theorem. If a hamiltonian function $H$ on a symplectic manifold $(M^{2n}, \omega^2)$
admits the one-parameter group of canonical transformations given by a
hamiltonian $F$, then $F$ is a first integral of the system with hamiltonian
function $H$.

Proof. Since $H$ is a first integral of the flow $g_F^t$, $(H, F) = 0$ (Corollary 1).
Therefore, $(F, H) = 0$ (Corollary 4) and $F$ is a first integral (Corollary 1). □

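Problem 1. Compute the Poisson bracket of two functions $F$ and $H$ in the canonical coordinate
space $\mathbb{R}^{2n} = \{(\mathbf{p}, \mathbf{q})\}$, $\omega^2(\xi, \eta) = (I\xi, \eta)$.

Solution. By Corollary 3 we have

$$(F, H) = \sum_{i=1}^{n} \left( \frac{\partial H}{\partial p_i} \frac{\partial F}{\partial q_i} - \frac{\partial H}{\partial q_i} \frac{\partial F}{\partial p_i} \right)$$

(we use the fact that $I$ is symplectic and has the form

$$I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$$

in the basis $(\mathbf{p}, \mathbf{q})$).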
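The coordinate formula of Problem 1 can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the partial derivatives are approximated by central differences, and the names `partial`, `poisson_bracket`, `H`, `M` and `G` are hypothetical, chosen for this example. It takes a rotationally symmetric hamiltonian $H$ and the generator $M = q_1 p_2 - q_2 p_1$ of rotations in the $(q_1, q_2)$-plane, so that by the theorem above $(M, H)$ should vanish.

```python
def partial(f, p, q, which, i, h=1e-5):
    """Central-difference estimate of df/dp_i (which="p") or df/dq_i (which="q")."""
    p_hi, p_lo, q_hi, q_lo = list(p), list(p), list(q), list(q)
    if which == "p":
        p_hi[i] += h
        p_lo[i] -= h
    else:
        q_hi[i] += h
        q_lo[i] -= h
    return (f(p_hi, q_hi) - f(p_lo, q_lo)) / (2 * h)


def poisson_bracket(F, H, p, q):
    """(F, H) = sum_i (dH/dp_i * dF/dq_i - dH/dq_i * dF/dp_i) at the point (p, q)."""
    return sum(
        partial(H, p, q, "p", i) * partial(F, p, q, "q", i)
        - partial(H, p, q, "q", i) * partial(F, p, q, "p", i)
        for i in range(len(p))
    )


def H(p, q):
    """A hamiltonian invariant under simultaneous rotation of p and q (assumed example)."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.25 * (q[0] ** 2 + q[1] ** 2) ** 2


def M(p, q):
    """Generator of rotations in the (q_1, q_2)-plane."""
    return q[0] * p[1] - q[1] * p[0]


def G(p, q):
    """The coordinate function q_1, which is not a first integral of H."""
    return q[0]


p0, q0 = [0.3, -1.2], [0.7, 2.0]
print(poisson_bracket(M, H, p0, q0))  # close to 0: M is a first integral
print(poisson_bracket(G, H, p0, q0))  # close to dH/dp_1 = p_1, the rate of change of q_1
```

Finite differences are used only to keep the sketch free of dependencies; a symbolic computation would give the same brackets exactly.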


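Problem 2. Compute the Poisson brackets of the basic functions $p_i$ and $q_j$.

Solution. The gradients of the basic functions form a "symplectic basis": their skew-scalar
products are

$$(p_i, p_j) = (p_i, q_j) = (q_i, q_j) = 0 \ (\text{if } i \ne j), \qquad (q_i, p_i) = -(p_i, q_i) = 1.$$

Problem 3. Show that the map $A : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ sending $(\mathbf{p}, \mathbf{q}) \to (\mathbf{P}(\mathbf{p}, \mathbf{q}), \mathbf{Q}(\mathbf{p}, \mathbf{q}))$ is canonical if
and only if the Poisson brackets of any two functions in the variables $(\mathbf{p}, \mathbf{q})$ and $(\mathbf{P}, \mathbf{Q})$ coincide:

$$(F, H)_{\mathbf{p},\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}} \frac{\partial F}{\partial \mathbf{q}} - \frac{\partial H}{\partial \mathbf{q}} \frac{\partial F}{\partial \mathbf{p}} = \frac{\partial H}{\partial \mathbf{P}} \frac{\partial F}{\partial \mathbf{Q}} - \frac{\partial H}{\partial \mathbf{Q}} \frac{\partial F}{\partial \mathbf{P}} = (F, H)_{\mathbf{P},\mathbf{Q}}.$$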

Solution. Let $A$ be canonical. Then the symplectic structures $d\mathbf{p} \wedge d\mathbf{q}$ and $d\mathbf{P} \wedge d\mathbf{Q}$ coincide.
But the definition of the Poisson bracket $(F, H)$ was given invariantly in terms of the symplectic
structure; it did not involve the coordinates. Therefore,

$$(F, H)_{\mathbf{p},\mathbf{q}} = (F, H) = (F, H)_{\mathbf{P},\mathbf{Q}}.$$

Conversely, suppose that the Poisson brackets $(P_i, Q_j)_{\mathbf{p},\mathbf{q}}$ have the standard form of Problem 2.
Then, clearly, $d\mathbf{P} \wedge d\mathbf{Q} = d\mathbf{p} \wedge d\mathbf{q}$, i.e., the map $A$ is canonical.

Problem 4. Show that the Poisson bracket of a product can be calculated by Leibniz's rule:

$$(F_1 F_2, H) = F_1 (F_2, H) + F_2 (F_1, H).$$

Hint. The Poisson bracket $(F_1 F_2, H)$ is the derivative of the product $F_1 F_2$ in the direction
of the field $I\,dH$.

B  The  Jacobi  identity 

Theorem. The Poisson bracket of three functions $A$, $B$, and $C$ satisfies the
Jacobi identity:

$$((A, B), C) + ((B, C), A) + ((C, A), B) = 0.$$

Corollary (Poisson's theorem). The Poisson bracket of two first integrals
$F_1$, $F_2$ of a system with hamiltonian function $H$ is again a first integral.

Proof of the corollary. By the Jacobi identity,

$$((F_1, F_2), H) = (F_1, (F_2, H)) + (F_2, (H, F_1)) = 0 + 0,$$

as was to be shown. □

In this way, by knowing two first integrals we can find a third, fourth, etc.
by a simple computation. Of course, not all the integrals we get will be
essentially new, since there cannot be more than $2n$ independent functions
on $M^{2n}$. Sometimes we may get functions of old integrals or constants,
which may be zero. But sometimes we do obtain new integrals.

Problem. Calculate the Poisson brackets of the components $p_1, p_2, p_3, M_1, M_2, M_3$ of the
linear and angular momentum vectors of a mechanical system.

Answer. $(M_1, M_2) = M_3$, $(M_1, p_1) = 0$, $(M_1, p_2) = p_3$, $(M_1, p_3) = -p_2$. This implies

Theorem. If two components, $M_1$ and $M_2$, of the angular momentum of some mechanical problem
are conserved, then the third component is also conserved.
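The brackets in the answer above can be checked numerically for a single particle. The sketch below is an illustration, not part of the text: it assumes the angular momentum $\mathbf{M} = \mathbf{q} \times \mathbf{p}$ and the coordinate formula $(F, H) = \sum_i (\partial H/\partial p_i\, \partial F/\partial q_i - \partial H/\partial q_i\, \partial F/\partial p_i)$, and the helper names `grad`, `bracket`, `angular_momentum` and `momentum` are hypothetical.

```python
def grad(f, p, q, h=1e-4):
    """Central-difference gradients (df/dp, df/dq) of f at the point (p, q)."""
    dp, dq = [], []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        dp.append((f(hi, q) - f(lo, q)) / (2 * h))
        hi, lo = list(q), list(q)
        hi[i] += h
        lo[i] -= h
        dq.append((f(p, hi) - f(p, lo)) / (2 * h))
    return dp, dq


def bracket(F, H, p, q):
    """(F, H) = sum_i (dH/dp_i * dF/dq_i - dH/dq_i * dF/dp_i)."""
    Fp, Fq = grad(F, p, q)
    Hp, Hq = grad(H, p, q)
    return sum(Hp[i] * Fq[i] - Hq[i] * Fp[i] for i in range(len(p)))


def angular_momentum(k):
    """k-th component (k = 0, 1, 2) of M = q x p, as a function of (p, q)."""
    i, j = (k + 1) % 3, (k + 2) % 3
    return lambda p, q: q[i] * p[j] - q[j] * p[i]


def momentum(k):
    """k-th component of the linear momentum, as a function of (p, q)."""
    return lambda p, q: p[k]


p0, q0 = [0.5, -1.0, 2.0], [1.5, 0.25, -0.75]
M1, M2, M3 = (angular_momentum(k) for k in range(3))
print(bracket(M1, M2, p0, q0), M3(p0, q0))       # the two numbers should agree
print(bracket(M1, momentum(1), p0, q0), p0[2])   # (M1, p2) against p3
print(bracket(M1, momentum(2), p0, q0), -p0[1])  # (M1, p3) against -p2
```

Since all the functions involved are at most quadratic, central differences reproduce the derivatives up to rounding error.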

Proof of the Jacobi identity. Consider the sum

$$((A, B), C) + ((B, C), A) + ((C, A), B).$$

This sum is a "linear combination of second partial derivatives" of the
functions $A$, $B$, and $C$. We will compute the terms in the second derivatives
of $A$:

$$((A, B), C) + ((C, A), B) = (L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}) A,$$

where $L_{\xi}$ is differentiation in the direction of $\xi$ and $\mathbf{F}$ is the hamiltonian
field with hamiltonian function $F$.

But, by Lemma 2, Section 39, the commutator of the differentiations
$L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}$ is a first-order differential operator. This means that none
of the second derivatives of $A$ are contained in our sum. The same thing is
true for the second derivatives of $B$ and $C$. Therefore, the sum is zero. □


Corollary 5. Let $\mathbf{B}$ and $\mathbf{C}$ be hamiltonian fields with hamiltonian functions
$B$ and $C$. Consider the Poisson bracket $[\mathbf{B}, \mathbf{C}]$ of the vector fields. This
vector field is hamiltonian, and its hamiltonian function is equal to the
Poisson bracket of the hamiltonian functions $(B, C)$.

Proof. Set $(B, C) = D$. The Jacobi identity can be rewritten in the form

$$(A, D) = ((A, B), C) - ((A, C), B),$$

$$L_{\mathbf{D}} = L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}, \qquad L_{\mathbf{D}} = L_{[\mathbf{B}, \mathbf{C}]},$$

as was to be shown. □


C The Lie algebras of hamiltonian fields, hamiltonian functions, and first integrals

A  linear  subspace  of  a  Lie  algebra  is  called  a  subalgebra  if  the  commutator 
of  any  two  elements  of  the  subspace  belongs  to  it.  A  subalgebra  of  a  Lie 
algebra  is  itself  a  Lie  algebra.  The  preceding  corollary  implies,  in  particular, 


Corollary  6.  The  hamiltonian  vector  fields  on  a  symplectic  manifold  form  a 
subalgebra  of  the  Lie  algebra  of  all  vector  fields. 

Poisson’s  theorem  on  first  integrals  can  be  re-formulated  as 


Corollary  7.  The  first  integrals  of  a  hamiltonian  phase  flow  form  a  subalgebra 
of  the  Lie  algebra  of  all  functions. 

The Lie algebra of hamiltonian functions can be mapped naturally onto
the Lie algebra of hamiltonian vector fields. To do this, to every function $H$
we associate the hamiltonian vector field $\mathbf{H}$ with hamiltonian function $H$.

Corollary 8. The map of the Lie algebra of functions onto the Lie algebra of
hamiltonian fields is an algebra homomorphism. Its kernel consists of the
locally constant functions. If $M^{2n}$ is connected, the kernel is one-dimensional
and consists of constants.

Proof. Our map is linear. Corollary 5 says that our map carries the Poisson
bracket of functions into the Poisson bracket of vector fields. The kernel
consists of functions $H$ for which $I\,dH = 0$. Since $I$ is an isomorphism,
$dH = 0$ and $H = \text{const}$. □

Corollary 9. The phase flows with hamiltonian functions $H_1$ and $H_2$ commute
if and only if the Poisson bracket of the functions $H_1$ and $H_2$ is (locally)
constant.

Proof. By the theorem in Section 39, E, it is necessary and sufficient that
$[\mathbf{H}_1, \mathbf{H}_2] = 0$, and by Corollary 8 this condition is equivalent to $d(H_1, H_2) = 0$. □

We  obtain  yet  another  generalization  of  E.  Noether’s  theorem:  given  a 
flow  which  commutes  with  the  one  under  consideration,  one  can  construct 
a  first  integral. 

D  Locally  hamiltonian  vector  fields 

Let $(M^{2n}, \omega^2)$ be a symplectic manifold and $g^t : M^{2n} \to M^{2n}$ a one-parameter group of
diffeomorphisms preserving the symplectic structure. Will $g^t$ be a hamiltonian flow?

Example. Let $M^{2n}$ be a two-dimensional torus $T^2$, a point of which is given by a pair of
coordinates $(p, q) \bmod 1$. Let $\omega^2$ be the usual area element $dp \wedge dq$. Consider the family of
translations $g^t(p, q) = (p + t, q)$ (Figure 173). The maps $g^t$ preserve the symplectic structure (i.e.,
area). Can we find a hamiltonian function corresponding to the vector field $(\dot{p} = 1, \dot{q} = 0)$?
If $\dot{p} = -\partial H/\partial q$ and $\dot{q} = \partial H/\partial p$, we would have $\partial H/\partial p = 0$ and $\partial H/\partial q = -1$, i.e., $H = -q + C$.
But $q$ is only a local coordinate on $T^2$; there is no map $H : T^2 \to \mathbb{R}$ for which $\partial H/\partial p = 0$ and
$\partial H/\partial q = -1$. Thus $g^t$ is not a hamiltonian phase flow.

Figure 173 A locally hamiltonian field on the torus

Definition. A locally hamiltonian vector field on a symplectic manifold $(M^{2n}, \omega^2)$ is the vector
field $I\omega^1$, where $\omega^1$ is a closed 1-form on $M^{2n}$.

Locally, a closed 1-form is the differential of a function, $\omega^1 = dH$. However, in attempting
to extend the function $H$ to the whole manifold $M^{2n}$ we may obtain a "many-valued hamiltonian
function," since a closed 1-form on a non-simply-connected manifold may not be a differential
(for example, the form $dq$ on $T^2$). A phase flow given by a locally hamiltonian vector field is called
a locally hamiltonian flow.
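The obstruction in the torus example can be made concrete by integrating the closed form $-dq$ (the would-be $dH$ for $H = -q + C$) around closed loops on $T^2$: the integral of the differential of a single-valued function around any closed loop is zero. The following Python sketch is an illustration under assumptions: loops are described by lifts to the plane, the integral is approximated by the midpoint rule, and the names `loop_integral`, `minus_dq`, `around_q` and `around_p` are hypothetical.

```python
def loop_integral(form, loop, steps=1000):
    """Midpoint-rule integral of the 1-form a dp + b dq along loop(s), 0 <= s <= 1.

    form(p, q) returns the coefficients (a, b); loop(s) returns a point (p, q)
    of the plane covering the torus.
    """
    total = 0.0
    for k in range(steps):
        p_start, q_start = loop(k / steps)
        p_end, q_end = loop((k + 1) / steps)
        a, b = form((p_start + p_end) / 2, (q_start + q_end) / 2)
        total += a * (p_end - p_start) + b * (q_end - q_start)
    return total


def minus_dq(p, q):
    """Coefficients of the closed form -dq."""
    return (0.0, -1.0)


def around_q(s):
    """Loop winding once in the q-direction; closed on the torus since q is taken mod 1."""
    return (0.25, s)


def around_p(s):
    """Loop winding once in the p-direction."""
    return (s, 0.25)


print(loop_integral(minus_dq, around_q))  # nonzero, so -dq is not the differential of a function on T^2
print(loop_integral(minus_dq, around_p))  # zero
```

The nonzero period along the $q$-circle is exactly the jump of the "many-valued hamiltonian function" $-q$ after one circuit.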

Problem. Show that a one-parameter group of diffeomorphisms of a symplectic manifold preserves
the symplectic structure if and only if it is a locally hamiltonian phase flow.

Hint. Cf. Section 38A.

Problem. Show that in the symplectic space $\mathbb{R}^{2n}$, every one-parameter group of canonical
diffeomorphisms (preserving $d\mathbf{p} \wedge d\mathbf{q}$) is a hamiltonian flow.

Hint. Every closed 1-form on $\mathbb{R}^{2n}$ is the differential of a function.

Problem. Show that the locally hamiltonian vector fields form a subalgebra of the Lie algebra
of all vector fields. In addition, the Poisson bracket of two locally hamiltonian fields is actually
a hamiltonian field, with a hamiltonian function uniquely⁶⁵ determined by the given fields $\xi$
and $\eta$ by the formula $H = \omega^2(\xi, \eta)$. Thus, the hamiltonian fields form an ideal in the Lie algebra
of locally hamiltonian fields.


41  Symplectic  geometry 

A euclidean structure on a vector space is given by a symmetric bilinear form, and a symplectic
structure by a skew-symmetric one. The geometry of a symplectic space is different from that of
a euclidean space, although there are many similarities.

A  Symplectic  vector  spaces 

Let $\mathbb{R}^{2n}$ be an even-dimensional vector space.

Definition. A symplectic linear structure on $\mathbb{R}^{2n}$ is a nondegenerate⁶⁶ bilinear
skew-symmetric 2-form given in $\mathbb{R}^{2n}$. This form is called the
skew-scalar product and is denoted by $[\xi, \eta] = -[\eta, \xi]$. The space $\mathbb{R}^{2n}$,
together with the symplectic structure $[\ ,\ ]$, is called a symplectic vector
space.

Example. Let $(p_1, \ldots, p_n, q_1, \ldots, q_n)$ be coordinate functions on $\mathbb{R}^{2n}$, and
$\omega^2$ the form

$$\omega^2 = p_1 \wedge q_1 + \cdots + p_n \wedge q_n.$$

Since this form is nondegenerate and skew-symmetric, it can be taken for a
skew-scalar product: $[\xi, \eta] = \omega^2(\xi, \eta)$. In this way the coordinate space
$\mathbb{R}^{2n} = \{(\mathbf{p}, \mathbf{q})\}$ receives a symplectic structure. This structure is called the
standard symplectic structure. In the standard symplectic structure the
skew-scalar product of two vectors $\xi$ and $\eta$ is equal to the sum of the oriented
areas of the projections of the parallelogram $(\xi, \eta)$ on the $n$ coordinate planes $(p_i, q_i)$.
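The "sum of oriented areas" description translates directly into a few lines of code. The Python sketch below is illustrative only: it assumes vectors are stored as lists $(p_1, \ldots, p_n, q_1, \ldots, q_n)$, and the function name `skew` is hypothetical.

```python
def skew(xi, eta):
    """Standard skew-scalar product [xi, eta] on R^{2n}.

    Each term xi_p * eta_q - eta_p * xi_q is the oriented area of the
    projection of the parallelogram (xi, eta) on one coordinate plane (p_i, q_i).
    """
    n = len(xi) // 2
    return sum(xi[i] * eta[n + i] - eta[i] * xi[n + i] for i in range(n))


xi = [1, 2, 3, 4]    # p = (1, 2), q = (3, 4)
eta = [5, 6, 7, 8]   # p = (5, 6), q = (7, 8)
print(skew(xi, eta))                      # -16
print(skew(eta, xi))                      # 16, by skew-symmetry
print(skew(xi, xi))                       # 0
print(skew([1, 0, 0, 0], [0, 0, 1, 0]))   # 1: the unit vectors along p_1 and q_1
```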

Two vectors $\xi$ and $\eta$ in a symplectic space are called skew-orthogonal
(written $\xi \angle \eta$) if their skew-scalar product is equal to zero.

Problem. Show that $\xi \angle \xi$: every vector is skew-orthogonal to itself.

The set of all vectors skew-orthogonal to a given vector $\eta$ is called the
skew-orthogonal complement to $\eta$.

65  Not  just  up  to  a  constant. 

66 A 2-form $[\ ,\ ]$ on $\mathbb{R}^{2n}$ is nondegenerate if $([\xi, \eta] = 0,\ \forall \eta) \Rightarrow (\xi = 0)$.




Problem. Show that the skew-orthogonal complement to $\eta$ is a $(2n - 1)$-dimensional hyperplane
containing $\eta$.

Hint. If all vectors were skew-orthogonal to $\eta$, then the form $[\ ,\ ]$ would be degenerate.


B  The  symplectic  basis 

A euclidean structure under a suitable choice of basis (it must be orthonormal)
is given by a scalar product in a particular standard form. In exactly
the same way, a symplectic structure takes the standard form indicated
above in a suitable basis.

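Problem. Find the skew-scalar product of the basis vectors $e_{p_i}$ and $e_{q_i}$ $(i = 1, \ldots, n)$ in the example
presented above.

Solution. The relations

(1) $[e_{p_i}, e_{p_j}] = [e_{p_i}, e_{q_j}] = [e_{q_i}, e_{q_j}] = 0, \qquad [e_{p_i}, e_{q_i}] = 1$

(the mixed bracket $[e_{p_i}, e_{q_j}]$ being taken for $i \ne j$) follow from the definition of $p_1 \wedge q_1 + \cdots + p_n \wedge q_n$.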


We  now  return  to  the  general  symplectic  space. 

Definition. A symplectic basis is a set of $2n$ vectors $e_{p_i}, e_{q_i}$ $(i = 1, \ldots, n)$
whose skew-scalar products have the form (1).

In other words, every basis vector is skew-orthogonal to all the basis
vectors except one, associated to it; its product with the associated vector
is equal to $\pm 1$.

Theorem.  Every  symplectic  space  has  a  symplectic  basis.  Furthermore,  we  can 
take  any  nonzero  vector  e  for  the  first  basis  vector. 

Proof.  This  theorem  is  entirely  analogous  to  the  corresponding  theorem  in 
euclidean  geometry  and  is  proved  in  almost  the  same  way. 

Since the vector $\mathbf{e}$ is not zero, there is a vector $\mathbf{f}$ not skew-orthogonal to it
(the form $[\ ,\ ]$ is nondegenerate). By choosing the length of this vector, we
can ensure that its skew-scalar product with $\mathbf{e}$ is equal to 1. In the case $n = 1$,
the theorem is proved.

If $n > 1$, consider the skew-orthogonal complement $D$ (Figure 174) to
the pair of vectors $\mathbf{e}$, $\mathbf{f}$. $D$ is the intersection of the skew-orthogonal complements
to $\mathbf{e}$ and $\mathbf{f}$. These two $(2n - 1)$-dimensional spaces do not coincide,
since $\mathbf{e}$ is not in the skew-orthogonal complement to $\mathbf{f}$. Therefore, their intersection
has even dimension $2n - 2$.

Figure 174 Skew-orthogonal complement

We will show that $D$ is a symplectic subspace of $\mathbb{R}^{2n}$, i.e., that the skew-scalar
product $[\ ,\ ]$ restricted to $D$ is nondegenerate. If a vector $\xi \in D$
were skew-orthogonal to the whole subspace $D$, then since it would also be
skew-orthogonal to $\mathbf{e}$ and to $\mathbf{f}$, $\xi$ would be skew-orthogonal to $\mathbb{R}^{2n}$, which
contradicts the nondegeneracy of $[\ ,\ ]$ on $\mathbb{R}^{2n}$. Thus $D^{2n-2}$ is symplectic.

Now if we adjoin the vectors $\mathbf{e}$ and $\mathbf{f}$ to a symplectic basis for $D^{2n-2}$ we
get a symplectic basis for $\mathbb{R}^{2n}$, and the theorem is proved by induction on $n$.

□ 
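The induction in this proof is effectively an algorithm, a skew analogue of Gram-Schmidt orthogonalization. The Python sketch below is one possible rendering under stated assumptions: the skew-scalar product is given by a nondegenerate skew-symmetric matrix `omega` in some basis, exact rational arithmetic is used, and instead of describing $D$ abstractly each remaining vector $v$ is replaced by its projection $v - [v, \mathbf{f}]\,\mathbf{e} + [v, \mathbf{e}]\,\mathbf{f}$ into $D$. The names `skew` and `symplectic_basis` are hypothetical.

```python
from fractions import Fraction


def skew(omega, x, y):
    """Skew-scalar product [x, y] = sum_ij x_i * omega_ij * y_j."""
    size = len(x)
    return sum(x[i] * omega[i][j] * y[j] for i in range(size) for j in range(size))


def symplectic_basis(omega):
    """Return pairs (e_k, f_k) with [e_k, f_k] = 1 and all other products zero."""
    dim = len(omega)
    # Start from the coordinate basis; it is consumed two vectors at a time.
    vectors = [[Fraction(int(i == j)) for j in range(dim)] for i in range(dim)]
    pairs = []
    while vectors:
        e = vectors.pop(0)
        # Nondegeneracy guarantees a remaining vector not skew-orthogonal to e.
        k = next((k for k, v in enumerate(vectors) if skew(omega, e, v) != 0), None)
        if k is None:
            raise ValueError("the form is degenerate")
        f = vectors.pop(k)
        scale = skew(omega, e, f)
        f = [c / scale for c in f]  # normalize so that [e, f] = 1
        # Project the remaining vectors into the skew-orthogonal complement D of (e, f).
        projected = []
        for v in vectors:
            a, b = skew(omega, v, f), skew(omega, v, e)
            projected.append([v[i] - a * e[i] + b * f[i] for i in range(dim)])
        vectors = projected
        pairs.append((e, f))
    return pairs


# An arbitrary nondegenerate skew-symmetric form on R^4 (assumed example).
omega = [[0, 1, 2, 0], [-1, 0, 0, 3], [-2, 0, 0, 1], [0, -3, -1, 0]]
for e, f in symplectic_basis(omega):
    print(e, f, skew(omega, e, f))
```

Taking the first vector of the list as $\mathbf{e}$ reflects the statement that any nonzero vector may serve as the first basis vector.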


Corollary.  All  symplectic  spaces  of  the  same  dimension  are  isomorphic. 

If we take the vectors of a symplectic basis as coordinate unit vectors,
we obtain a coordinate system $p_i, q_i$ in which $[\ ,\ ]$ takes the standard
form $p_1 \wedge q_1 + \cdots + p_n \wedge q_n$. Such a coordinate system is called symplectic.

C  The  symplectic  group 

To a euclidean structure we associated the orthogonal group of linear mappings
which preserved the euclidean structure. In a symplectic space the
symplectic group plays an analogous role.

Definition. A linear transformation $S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ of the symplectic space
$\mathbb{R}^{2n}$ to itself is called symplectic if it preserves the skew-scalar product:

$$[S\xi, S\eta] = [\xi, \eta], \qquad \forall \xi, \eta \in \mathbb{R}^{2n}.$$

The set of all symplectic transformations of $\mathbb{R}^{2n}$ is called the symplectic
group and is denoted by $Sp(2n)$.

It is clear that the composition of two symplectic transformations is
symplectic. To justify the term symplectic group, we must only show that a
symplectic transformation is nonsingular; it is then clear that the inverse is
also symplectic.

Problem. Show that the group $Sp(2)$ is isomorphic to the group of real two-by-two matrices
with determinant 1 and is homeomorphic to the interior of a solid three-dimensional torus.


Theorem. A transformation $S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ of the standard symplectic space
$(\mathbf{p}, \mathbf{q})$ is symplectic if and only if it is linear and canonical, i.e., preserves the
differential 2-form

$$\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n.$$

Proof. Under the natural identification of the tangent space to $\mathbb{R}^{2n}$ with
$\mathbb{R}^{2n}$, the 2-form $\omega^2$ goes to $[\ ,\ ]$. □

Corollary. The determinant of any symplectic transformation is equal to 1.

Proof. We already know (Section 38B) that canonical maps preserve the
exterior powers of the form $\omega^2$. But its $n$-th exterior power is (up to a constant
multiple) the volume element on $\mathbb{R}^{2n}$. This means that symplectic transformations
$S$ of the standard $\mathbb{R}^{2n} = \{(\mathbf{p}, \mathbf{q})\}$ preserve the volume element,
so $\det S = 1$. But since every symplectic linear structure can be written down
in standard form in a symplectic coordinate system, the determinant of a
symplectic transformation of any symplectic space is equal to 1. □

Theorem. A linear transformation $S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is symplectic if and only if it
takes some (and therefore any) symplectic basis into a symplectic basis.

Proof. The skew-scalar product of any two linear combinations of basis vectors
can be expressed in terms of skew-scalar products of basis vectors. If the
transformation does not change the skew-scalar products of basis vectors,
then it does not change the skew-scalar products of any vectors. □

D  Planes  in  symplectic  space 

In a euclidean space all planes are equivalent: each of them can be carried into
any other one by a motion. We will now look at a symplectic vector space
from this point of view.

Problem. Show that a nonzero vector in a symplectic space can be carried into any other nonzero
vector by a symplectic transformation.

Problem. Show that not every two-dimensional plane of the symplectic space $\mathbb{R}^{2n}$ can be
obtained from a given 2-plane by a symplectic transformation.

Hint. Consider the planes $(p_1, p_2)$ and $(p_1, q_1)$.

Definition. A $k$-dimensional plane (i.e., subspace) of a symplectic space is
called null⁶⁷ if it is skew-orthogonal to itself, i.e., if the skew-scalar product
of any two vectors of the plane is equal to zero.

Example. The coordinate plane $(p_1, \ldots, p_k)$ in the symplectic coordinate system $\mathbf{p}, \mathbf{q}$ is null.
(Prove it!)

Problem. Show that any non-null two-dimensional plane can be carried into any other non-null
two-plane by a symplectic transformation.


For calculations in symplectic geometry it may be useful to impose some
euclidean structure on the symplectic space. We fix a symplectic coordinate
system $\mathbf{p}, \mathbf{q}$ and introduce a euclidean structure using the coordinate scalar
product

$$(\mathbf{x}, \mathbf{x}) = \sum (p_i^2 + q_i^2), \qquad \text{where } \mathbf{x} = \sum (p_i e_{p_i} + q_i e_{q_i}).$$

The symplectic basis $e_p, e_q$ is orthonormal in this euclidean structure. The
skew-scalar product, like every bilinear form, can be expressed in terms of
the scalar product by

(2) $[\xi, \eta] = (I\xi, \eta),$

where $I : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is some operator. It follows from the skew-symmetry of
the skew-scalar product that the operator $I$ is skew-symmetric.

67 Null planes are also called isotropic, and for $k = n$, lagrangian.


Problem. Compute the matrix of the operator $I$ in the symplectic basis $e_{p_i}, e_{q_i}$.

Answer.

$$\begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix},$$

where $E$ is the $n \times n$ identity matrix.

Thus, for $n = 1$ (in the $p, q$-plane), $I$ is simply rotation by 90°, and in the
general case $I$ is rotation by 90° in each of the $n$ planes $p_i, q_i$.

Problem. Show that the operator $I$ is symplectic and that $I^2 = -E_{2n}$.

Although  the  euclidean  structures  and  the  operator  I  are  not  invariantly 
associated  to  a  symplectic  space,  they  are  often  convenient. 
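Relation (2) and the identity $I^2 = -E_{2n}$ are easy to confirm on examples. The following Python sketch is illustrative only: it assumes the block form $\begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$ of $I$ in the basis $(e_{p_1}, \ldots, e_{p_n}, e_{q_1}, \ldots, e_{q_n})$, and the names `make_I`, `apply`, `dot` and `skew` are hypothetical.

```python
def make_I(n):
    """Matrix of I in the symplectic basis: blocks [[0, -E], [E, 0]]."""
    size = 2 * n
    I = [[0] * size for _ in range(size)]
    for i in range(n):
        I[i][n + i] = -1   # upper right block: -E
        I[n + i][i] = 1    # lower left block: E
    return I


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def dot(x, y):
    """Coordinate scalar product (x, y)."""
    return sum(a * b for a, b in zip(x, y))


def skew(xi, eta):
    """Standard skew-scalar product of vectors stored as (p_1..p_n, q_1..q_n)."""
    n = len(xi) // 2
    return sum(xi[i] * eta[n + i] - eta[i] * xi[n + i] for i in range(n))


I = make_I(2)
xi, eta = [1, 2, 3, 4], [5, 6, 7, 8]
print(dot(apply(I, xi), eta), skew(xi, eta))  # both sides of relation (2)
print(apply(I, apply(I, xi)))                 # I applied twice returns -xi
```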

The  following  theorem  follows  directly  from  (2). 

Theorem. A plane $\pi$ of a symplectic space is null if and only if the plane $I\pi$ is
orthogonal to $\pi$.

Notice that the dimensions of the planes $\pi$ and $I\pi$ are the same, since $I$ is
nonsingular. Hence

Corollary. The dimension of a null plane in $\mathbb{R}^{2n}$ is less than or equal to $n$.

This follows since the two $k$-dimensional planes $\pi$ and $I\pi$ cannot be
orthogonal if $k > n$.

We consider more carefully the $n$-dimensional null planes in the symplectic
coordinate space $\mathbb{R}^{2n}$. An example of such a plane is the coordinate $\mathbf{p}$-plane.
There are in all $\binom{2n}{n}$ $n$-dimensional coordinate planes in $\mathbb{R}^{2n} = \{(\mathbf{p}, \mathbf{q})\}$.

Problem. Show that there are $2^n$ null planes among the $\binom{2n}{n}$ $n$-dimensional coordinate planes:
to each of the $2^n$ partitions of the set $(1, \ldots, n)$ into two parts $(i_1, \ldots, i_k)$, $(j_1, \ldots, j_{n-k})$ we associate
the null coordinate plane $p_{i_1}, \ldots, p_{i_k}, q_{j_1}, \ldots, q_{j_{n-k}}$.
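For small $n$ the count in this problem can be confirmed by brute force. The Python sketch below is an illustration, not a proof: it enumerates all $n$-element subsets of the symplectic basis and tests whether every pair of chosen basis vectors is skew-orthogonal. The indexing convention (indices $0, \ldots, n-1$ for $e_{p_i}$ and $n, \ldots, 2n-1$ for $e_{q_i}$) and the function names are assumptions of this example.

```python
from itertools import combinations


def count_coordinate_planes(n):
    """Return (number of n-dimensional coordinate planes, number of null ones)."""

    def skew_of_basis_vectors(a, b):
        # [e_{p_i}, e_{q_i}] = 1, [e_{q_i}, e_{p_i}] = -1, everything else 0.
        if a < n and b == a + n:
            return 1
        if b < n and a == b + n:
            return -1
        return 0

    total = null = 0
    for plane in combinations(range(2 * n), n):
        total += 1
        if all(skew_of_basis_vectors(a, b) == 0 for a in plane for b in plane):
            null += 1
    return total, null


for n in (1, 2, 3):
    print(n, count_coordinate_planes(n))
```

A coordinate plane is null exactly when it never contains both $e_{p_i}$ and $e_{q_i}$ for the same $i$, which is why one vector must be chosen from each of the $n$ pairs.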

In order to study the generating functions of canonical transformations
we need

Theorem. Every $n$-dimensional null plane $\pi$ in the symplectic coordinate space
$\mathbb{R}^{2n}$ is transverse⁶⁸ to at least one of the $2^n$ coordinate null planes.

Figure 175 Construction of a coordinate plane $\sigma$ transversal to a given plane $\pi$.

Proof. Let $P$ be the null plane $p_1, \ldots, p_n$ (Figure 175). Consider the intersection
$\tau = \pi \cap P$. Suppose that the dimension of $\tau$ is equal to $k$, $0 \le k \le n$.
Like every $k$-dimensional subspace of the $n$-dimensional space, the plane $\tau$ is
transverse to at least one $(n - k)$-dimensional coordinate plane in $P$, let us
say the plane

$$\eta = (p_{i_1}, \ldots, p_{i_{n-k}}); \qquad \tau + \eta = P, \quad \tau \cap \eta = 0.$$

We now consider the null $n$-dimensional coordinate plane

$$\sigma = (p_{i_1}, \ldots, p_{i_{n-k}};\ q_{j_1}, \ldots, q_{j_k}), \qquad \eta = \sigma \cap P,$$

and show that our plane $\pi$ is transverse to $\sigma$:

$$\pi \cap \sigma = 0.$$

We have (writing $\angle$ for skew-orthogonality)

$$\left. \begin{aligned} \tau \subset \pi,\ \pi \angle \pi &\Rightarrow \tau \angle \pi \\ \eta \subset \sigma,\ \sigma \angle \sigma &\Rightarrow \eta \angle \sigma \end{aligned} \right\} \Rightarrow (\tau + \eta) \angle (\pi \cap \sigma) \Rightarrow P \angle (\pi \cap \sigma).$$

But $P$ is an $n$-dimensional null plane. Therefore, every vector skew-orthogonal
to $P$ belongs to $P$ (cf. the corollary above). Thus $(\pi \cap \sigma) \subset P$. Finally,

$$\pi \cap \sigma = (\pi \cap P) \cap (\sigma \cap P) = \tau \cap \eta = 0,$$

as was to be shown. □


Problem. Let $\pi_1$ and $\pi_2$ be two $k$-dimensional planes in symplectic $\mathbb{R}^{2n}$. Is it always possible to
carry $\pi_1$ to $\pi_2$ by a symplectic transformation? How many classes of planes are there which
cannot be carried one into another?

Answer. $[k/2] + 1$ if $k \le n$; $[(2n - k)/2] + 1$ if $k \ge n$.

E  Symplectic  structure  and  complex  structure 

Since $I^2 = -E$ we can introduce into our space $\mathbb{R}^{2n}$ not only a symplectic
structure $[\ ,\ ]$ and euclidean structure $(\ ,\ )$, but also a complex structure,
by defining multiplication by $i = \sqrt{-1}$ to be the action of $I$. The space $\mathbb{R}^{2n}$
is identified in this way with a complex space $\mathbb{C}^n$ (the coordinate space with
coordinates $z_k = p_k + i q_k$). The linear transformations of $\mathbb{R}^{2n}$ which preserve
the euclidean structure form the orthogonal group $O(2n)$; those preserving
the complex structure form the complex linear group $GL(n, \mathbb{C})$.

68 Two subspaces $L_1$ and $L_2$ of a vector space $L$ are transverse if $L_1 + L_2 = L$. Two $n$-dimensional
planes in $\mathbb{R}^{2n}$ are transverse if and only if they intersect only in 0.

Problem. Show that transformations which are both orthogonal and symplectic are complex,
that those which are both complex and orthogonal are symplectic, and that those which are
both symplectic and complex are orthogonal; thus the intersection of any two of the three
groups is equal to the intersection of all three:

$$O(2n) \cap Sp(2n) = Sp(2n) \cap GL(n, \mathbb{C}) = GL(n, \mathbb{C}) \cap O(2n).$$

This intersection is called the unitary group $U(n)$.


Unitary transformations preserve the hermitian scalar product $(\xi, \eta) + i[\xi, \eta]$;
the scalar and skew-scalar products on $\mathbb{R}^{2n}$ are its real and imaginary
parts.
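The relation between the three structures can be checked with Python's built-in complex numbers. The sketch below is an illustration under assumptions: vectors are stored as $(p_1, \ldots, p_n, q_1, \ldots, q_n)$, the identification is $z_k = p_k + i q_k$, and the hermitian product is taken to be conjugate-linear in its first argument, $\sum_k \bar{z}_k(\xi)\, z_k(\eta)$, which is the convention that yields $(\xi, \eta) + i[\xi, \eta]$. All function names are hypothetical.

```python
def as_complex(x):
    """Identify (p_1..p_n, q_1..q_n) with the complex vector z_k = p_k + i q_k."""
    n = len(x) // 2
    return [complex(x[k], x[n + k]) for k in range(n)]


def hermitian(xi, eta):
    """Hermitian product, conjugate-linear in the first argument (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(as_complex(xi), as_complex(eta)))


def dot(x, y):
    """Euclidean scalar product (x, y)."""
    return sum(a * b for a, b in zip(x, y))


def skew(xi, eta):
    """Standard skew-scalar product [xi, eta]."""
    n = len(xi) // 2
    return sum(xi[i] * eta[n + i] - eta[i] * xi[n + i] for i in range(n))


def apply_I(x):
    """Action of I = [[0, -E], [E, 0]]: (p, q) goes to (-q, p)."""
    n = len(x) // 2
    return [-c for c in x[n:]] + list(x[:n])


xi, eta = [1, 2, 3, 4], [5, 6, 7, 8]
print(hermitian(xi, eta), dot(xi, eta), skew(xi, eta))
print(as_complex(apply_I(xi)), [1j * z for z in as_complex(xi)])  # I acts as multiplication by i
```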


42  Parametric  resonance  in  systems  with  many  degrees 
of  freedom 

During our investigation of oscillating systems with periodically varying parameters (cf. Section
25), we explained that parametric resonance depends on the behavior of the eigenvalues of a
certain linear transformation ("the mapping at a period"). The dependence consists of the fact
that an equilibrium position of a system with periodically varying parameters is stable if the
eigenvalues of the mapping at a period have modulus less than 1, and unstable if at least one of
the eigenvalues has modulus greater than 1.

The mapping at a period obtained from a system of Hamilton's equations with periodic
coefficients is symplectic. The investigation in Section 25 of parametric resonance in a system
with one degree of freedom relied on our analysis of the behavior of the eigenvalues of symplectic
transformations of the plane. In this paragraph we will analyze, in an analogous way, the behavior
of the eigenvalues of symplectic transformations in a phase space of any dimension. The results
of this analysis (due to M. G. Krein) can be applied to the study of conditions for the appearance
of parametric resonance in mechanical systems with many degrees of freedom.

A  Symplectic  matrices 

Consider a linear transformation of a symplectic space, $S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$. Let
$p_1, \ldots, p_n; q_1, \ldots, q_n$ be a symplectic coordinate system. In this coordinate
system, the transformation is given by a matrix $S$.

Theorem. A transformation is symplectic if and only if its matrix $S$ in the symplectic
coordinate system $(\mathbf{p}, \mathbf{q})$ satisfies the relation

$$S' I S = I,$$

where

$$I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$$

and $S'$ is the transpose of $S$.

Proof. The condition for being symplectic ($[S\xi, S\eta] = [\xi, \eta]$ for all $\xi$ and $\eta$)
can be written in terms of the scalar product by using the operator $I$, as
follows:

$$(IS\xi, S\eta) = (I\xi, \eta), \qquad \forall \xi, \eta,$$

or

$$(S'IS\xi, \eta) = (I\xi, \eta), \qquad \forall \xi, \eta,$$

as was to be shown. □
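The matrix criterion $S'IS = I$ gives a direct computational test. The Python sketch below is illustrative only: it uses exact integer matrices stored as nested lists, assumes the block form $\begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$ for $I$, and the example matrices and all function names are hypothetical. The $4 \times 4$ example was built as a product of two block-triangular matrices with symmetric off-diagonal blocks.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def standard_I(n):
    """Matrix of I in the symplectic basis: blocks [[0, -E], [E, 0]]."""
    size = 2 * n
    I = [[0] * size for _ in range(size)]
    for i in range(n):
        I[i][n + i] = -1
        I[n + i][i] = 1
    return I


def is_symplectic(S):
    """Test the relation S' I S = I."""
    I = standard_I(len(S) // 2)
    return matmul(matmul(transpose(S), I), S) == I


shear = [[1, 1], [0, 1]]     # determinant 1, hence symplectic for n = 1
stretch = [[2, 0], [0, 1]]   # determinant 2, not symplectic
S4 = [[2, 1, 1, 0], [2, 1, 0, 2], [1, 1, 1, 0], [1, 0, 0, 1]]
print(is_symplectic(shear), is_symplectic(stretch), is_symplectic(S4))
```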

B  Symmetry  of  the  spectrum  of  a  symplectic 
transformation 

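Theorem. The characteristic polynomial of a symplectic transformation

$$p(\lambda) = \det(S - \lambda E)$$

is reflexive,⁶⁹ i.e., $p(\lambda) = \lambda^{2n} p(1/\lambda)$.

Proof. We will use the facts that $\det S = \det I = 1$, $I^2 = -E$, and $\det A' = \det A$.
By the theorem above, $S = -I S'^{-1} I$. Therefore,

$$\begin{aligned} p(\lambda) = \det(S - \lambda E) &= \det(-I S'^{-1} I - \lambda E) = \det(-S'^{-1} + \lambda E) \\ &= \det(-E + \lambda S) \\ &= \lambda^{2n} \det\left(S - \frac{1}{\lambda} E\right) = \lambda^{2n} p\left(\frac{1}{\lambda}\right). \end{aligned}$$ □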


Corollary. If $\lambda$ is an eigenvalue of a symplectic transformation, then $1/\lambda$ is also
an eigenvalue.
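The reflexive property says that the coefficients of the characteristic polynomial read the same in both directions. The following Python sketch checks this for one symplectic matrix. It is an illustration under assumptions: the coefficients of $\det(xE - S)$ are computed with the Faddeev-LeVerrier recursion in exact rational arithmetic (a technique chosen here for convenience, not taken from the text), and the example matrix and function names are hypothetical. For even dimension $\det(xE - S) = \det(S - xE)$, so this is the polynomial $p$ of the theorem.

```python
from fractions import Fraction


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def char_poly(A):
    """Coefficients of det(xE - A), highest degree first (Faddeev-LeVerrier)."""
    size = len(A)
    A = [[Fraction(c) for c in row] for row in A]
    M = [[Fraction(0)] * size for _ in range(size)]
    coeffs = [Fraction(1)]
    for k in range(1, size + 1):
        AM = matmul(A, M)
        # M_k = A M_{k-1} + (previous coefficient) * E
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(size)]
             for i in range(size)]
        trace = sum(matmul(A, M)[i][i] for i in range(size))
        coeffs.append(-trace / k)
    return coeffs


# A symplectic 4 x 4 matrix in the coordinates (p_1, p_2, q_1, q_2) (assumed example).
S4 = [[2, 1, 1, 0], [2, 1, 0, 2], [1, 1, 1, 0], [1, 0, 0, 1]]
coeffs = char_poly(S4)
print(coeffs)
print(coeffs == coeffs[::-1])  # reflexive: symmetric coefficients
```

Symmetric coefficients imply that the roots are invariant under $\lambda \to 1/\lambda$, which is the content of the corollary.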

On the other hand, the characteristic polynomial is real; therefore, if $\lambda$
is a complex eigenvalue, then $\bar{\lambda}$ is an eigenvalue different from $\lambda$. It follows
that the roots $\lambda$ of the characteristic polynomial lie symmetrically with
respect to the real axis and to the unit circle (Figure 176). They come in
4-tuples,

$$\lambda,\ \bar{\lambda},\ \frac{1}{\lambda},\ \frac{1}{\bar{\lambda}} \qquad (|\lambda| \ne 1,\ \operatorname{Im} \lambda \ne 0),$$

and pairs lying on the real axis,

$$\lambda = \bar{\lambda}, \qquad \frac{1}{\lambda} = \frac{1}{\bar{\lambda}},$$

or on the unit circle,

$$\lambda = \frac{1}{\bar{\lambda}}, \qquad \bar{\lambda} = \frac{1}{\lambda}.$$

Figure 176 Distribution of the eigenvalues of a symplectic transformation

It is not hard to verify that the multiplicities of all four points of a 4-tuple (or
both points of a pair) are the same.

69 A reflexive polynomial is a polynomial $a_0 x^m + a_1 x^{m-1} + \cdots + a_m$ which has symmetric
coefficients $a_0 = a_m$, $a_1 = a_{m-1}$, ....

C  Stability 

Definition. A transformation $S$ is called stable if

$$\forall \varepsilon > 0,\ \exists \delta > 0 : \quad |x| < \delta \Rightarrow |S^N x| < \varepsilon, \quad \forall N > 0.$$

Problem.  Show  that  if  at  least  one  of  the  eigenvalues  of  a  symplectic  transformation  S  does  not 
lie  on  the  unit  circle,  then  S  is  unstable. 

Hint. In view of the demonstrated symmetry, if one of the eigenvalues does not lie on the
unit circle, then there exists an eigenvalue outside the unit circle, $|\lambda| > 1$; in the corresponding
invariant subspace, $S$ is an "expansion with a rotation."

Problem.  Show  that  if  all  the  eigenvalues  of  a  linear  transformation  are  distinct  and  lie  on  the 
unit  circle,  then  the  transformation  is  stable. 

Hint.  Change  to  a  basis  of  eigenvectors. 
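The two problems above can be illustrated on the plane by iterating a transformation and watching $|S^N x|$. The Python sketch below is an illustration only: it compares a rotation (eigenvalues $e^{\pm i\alpha}$, distinct and on the unit circle) with a hyperbolic map (eigenvalues $2$ and $1/2$); both have determinant 1 and are therefore symplectic for $n = 1$. The matrices and names are assumptions of this example.

```python
import math


def orbit_norms(S, x, steps):
    """Euclidean norms |S^N x| for N = 1, ..., steps."""
    norms = []
    for _ in range(steps):
        x = [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
        norms.append(math.sqrt(sum(c * c for c in x)))
    return norms


alpha = 0.7
rotation = [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]   # eigenvalues exp(+-i alpha)
squeeze = [[2.0, 0.0], [0.0, 0.5]]                # eigenvalues 2 and 1/2

x0 = [1.0, 1.0]
print(max(orbit_norms(rotation, x0, 200)))  # stays at |x0| up to rounding: stable
print(orbit_norms(squeeze, x0, 50)[-1])     # grows like 2**N: unstable
```

A finite number of iterations cannot prove stability, of course; the sketch only shows the qualitative difference between the two cases.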


Definition.  A  symplectic  transformation  S  is  called  strongly  stable  if  every 
symplectic  transformation  sufficiently  close70  to  S  is  stable. 

In  Section  25  we  established  that  S:  (R2  -►  R2  is  strongly  stable  if  A1  2  = 
e±la  and  Ax  ^  A2. 

Theorem. If all $2n$ eigenvalues of a symplectic transformation $S$ are distinct
and lie on the unit circle, then $S$ is strongly stable.

Proof. We enclose the $2n$ eigenvalues $\lambda$ in $2n$ non-intersecting neighborhoods,
symmetric with respect to the unit circle and the real axis (Figure 177). The
$2n$ roots of the characteristic polynomial depend continuously on the elements
of the matrix of $S$. Therefore, if the matrix $S_1$ is sufficiently close to $S$,
exactly one eigenvalue $\lambda_1$ of the matrix of $S_1$ will lie in each of the $2n$
neighborhoods of the $2n$ points $\lambda$. But if one of the points $\lambda_1$ did not lie on the
unit circle, for example, if it lay outside the unit circle, then by the theorem in
subsection B, there would be another point $\lambda_2$, $|\lambda_2| < 1$, in the same
neighborhood, and the total number of roots would be greater than $2n$, which is not
possible.

Thus all the roots of $S_1$ lie on the unit circle and are distinct, so $S_1$ is
stable. □

Figure 177 Behavior of simple eigenvalues under a small change of the symplectic
transformation

70 $S_1$ is “sufficiently close” to $S$ if the elements of the matrix of $S_1$ in a fixed basis differ from the
elements of the matrix of $S$ in the same basis by less than a sufficiently small number $\varepsilon$.
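The simplest case $n = 1$ of the theorem can be checked numerically. The sketch below is only an illustration, not part of the proof. It relies on the fact that a linear map of the plane is symplectic exactly when its determinant is 1; the rotation angle `alpha`, the perturbation size `eps` and the helper names are hypothetical choices made for this example.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial lambda^2 - tr*lambda + det."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha = 0.7                       # hypothetical rotation angle
S = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha), math.cos(alpha)]]   # eigenvalues exp(+-i*alpha)

eps = 1e-3                        # hypothetical size of the perturbation
shear = [[1.0, eps], [0.0, 1.0]]  # det = 1, so S1 is again symplectic
S1 = matmul(shear, S)

lam1, lam2 = eigenvalues_2x2(S1)
moduli = [abs(lam1), abs(lam2)]          # both stay equal to 1
distinct = abs(lam1 - lam2) > 1e-6       # and the eigenvalues stay distinct
```

The perturbed eigenvalues remain distinct and on the unit circle, because the trace of `S1` still lies strictly between $-2$ and $2$ while the determinant is still 1.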

We might say that an eigenvalue $\lambda$ of a symplectic transformation can
leave the unit circle only by colliding with another eigenvalue (Figure 178);
at the same time, the complex-conjugate eigenvalues will collide, and from
the two pairs of roots on the unit circle we obtain one 4-tuple (or pair of
real $\lambda$).


Figure  178  Behavior  of  multiple  eigenvalues  under  a  small  change  of  the  symplectic 
transformation 

It follows from the results of Section 25 that the condition for parametric
resonance to arise in a linear canonical system with a periodically changing
hamiltonian function is precisely that the corresponding symplectic transformation
of phase space should cease to be stable. It is clear from the theorem
above that this can happen only after a collision of eigenvalues on the unit
circle. In fact, as M. G. Krein noticed, not every such collision is dangerous.

It turns out that the eigenvalues $\lambda$ with $|\lambda| = 1$ are divided into two classes:
positive and negative. When two roots with the same sign collide, the roots
“go through one another,” and cannot leave the unit circle. On the other
hand, when two roots with different signs collide, they generally leave the
unit circle.

M. G. Krein’s theory goes beyond the limits of this book; we will formulate
the basic results here in the form of problems.

Problem. Let $\lambda$ and $\bar\lambda$ be simple (multiplicity 1) eigenvalues of a symplectic transformation $S$
with $|\lambda| = 1$. Show that the two-dimensional invariant plane $\pi_\lambda$ corresponding to $\lambda$, $\bar\lambda$ is non-null.

Hint. Let $\xi_1$ and $\xi_2$ be complex eigenvectors of $S$ with eigenvalues $\lambda_1$ and $\lambda_2$. Then if $\lambda_1 \lambda_2 \ne 1$,
the vectors $\xi_1$ and $\xi_2$ are skew-orthogonal: $[\xi_1, \xi_2] = 0$.

Let $\xi$ be a real vector of the plane $\pi_\lambda$, where $\operatorname{Im} \lambda > 0$ and $|\lambda| = 1$. The eigenvalue $\lambda$ is called
positive if $[S\xi, \xi] > 0$.

Problem. Show that this definition is correct, i.e., it does not depend on the choice of $\xi \ne 0$ in
the plane $\pi_\lambda$.

Hint. If the plane $\pi_\lambda$ contained two non-collinear skew-orthogonal vectors, it would be null.

In the same way, an eigenvalue $\lambda$ of multiplicity $k$ with $|\lambda| = 1$ is of definite sign if the quadratic
form $[S\xi, \xi]$ is (positive or negative) definite on the invariant $2k$-dimensional subspace
corresponding to $\lambda$, $\bar\lambda$.

Problem. Show that $S$ is strongly stable if and only if all the eigenvalues $\lambda$ lie on the unit circle
and are of definite sign.

Hint. The quadratic form $[S\xi, \xi]$ is invariant with respect to $S$.

43  A  symplectic  atlas 

In this paragraph we prove Darboux’s theorem, according to which every symplectic manifold
has local coordinates $p$, $q$ in which the symplectic structure can be written in the simplest way:
$\omega^2 = dp \wedge dq$.

A  Symplectic  coordinates 

Recall that the definition of manifold includes a compatibility condition for
the charts of an atlas. This is a condition on the maps $\varphi_i^{-1}\varphi_j$ going from one
chart to another. The maps $\varphi_i^{-1}\varphi_j$ are maps of a region of coordinate space.

Definition. An atlas of a manifold $M^{2n}$ is called symplectic if the standard
symplectic structure $\omega^2 = dp \wedge dq$ is introduced into the coordinate
space $\mathbb{R}^{2n} = \{(p, q)\}$, and the transfer from one chart to another is realized
by a canonical (i.e., $\omega^2$-preserving) transformation⁷¹ $\varphi_i^{-1}\varphi_j$.

Problem. Show that a symplectic atlas defines a symplectic structure on $M^{2n}$.

The  converse  is  also  true:  every  symplectic  manifold  has  a  symplectic 
atlas.  This  follows  from  the  following  theorem. 

71 Complex-analytic manifolds, for example, are defined analogously; there must be a complex-analytic
structure on coordinate space, and the transfer from one chart to another must be
complex analytic.

B Darboux’s theorem

Theorem. Let $\omega^2$ be a closed nondegenerate differential 2-form in a neighborhood
of a point $x$ in the space $\mathbb{R}^{2n}$. Then in some neighborhood of $x$ one can
choose a coordinate system $(p_1, \ldots, p_n; q_1, \ldots, q_n)$ such that the form has the
standard form:

$$\omega^2 = \sum_{i=1}^{n} dp_i \wedge dq_i.$$

This theorem allows us to extend to all symplectic manifolds any assertion
of a local character which is invariant with respect to canonical transformations
and is proven for the standard phase space $(\mathbb{R}^{2n}, \omega^2 = dp \wedge dq)$.

C Construction of the coordinates $p_1$ and $q_1$

For the first coordinate $p_1$ we take a non-constant linear function (we could
have taken any differentiable function whose differential is not zero at the
point $x$). For simplicity we will assume that $p_1(x) = 0$.

Let $P_1 = I\,dp_1$ denote the hamiltonian field corresponding to the function
$p_1$ (Figure 179). Note that $P_1(x) \ne 0$; therefore, we can draw a hyperplane
$N^{2n-1}$ through the point $x$ which does not contain the vector $P_1(x)$ (we
could have taken any surface transverse to $P_1(x)$ as $N^{2n-1}$).


Figure  179  Construction  of  symplectic  coordinates 

Consider the hamiltonian flow $P_1^t$ with hamiltonian function $p_1$. We
consider the time $t$ necessary to go from $N$ to the point $z = P_1^t(y)$ ($y \in N$)
under the action of $P_1^t$ as a function of the point $z$. By the usual theorems in
the theory of ordinary differential equations, this function is defined and
differentiable in a neighborhood of the point $x \in \mathbb{R}^{2n}$. Denote it by $q_1$. Note
that $q_1 = 0$ on $N$ and that the derivative of $q_1$ in the direction of the field $P_1$
is equal to 1. Thus the Poisson bracket of the functions $q_1$ and $p_1$ we constructed
is equal to 1:

$$(q_1, p_1) = 1.$$

D Construction of symplectic coordinates by
induction on n

If $n = 1$, the construction is finished. Let $n > 1$. We will assume that Darboux’s
theorem is already proved for $\mathbb{R}^{2n-2}$. Consider the set $M$ given by the
equations $p_1 = q_1 = 0$. The differentials $dp_1$ and $dq_1$ are linearly independent
at $x$ since $\omega^2(I\,dp_1, I\,dq_1) = (q_1, p_1) = 1$. Thus, by the implicit function
theorem, the set $M$ is a manifold of dimension $2n - 2$ in a neighborhood of
$x$; we will denote it by $M^{2n-2}$.


Lemma. The symplectic structure $\omega^2$ on $\mathbb{R}^{2n}$ induces a symplectic structure on
some neighborhood of the point $x$ on $M^{2n-2}$.

Proof. For the proof we need only the nondegeneracy of $\omega^2$ on $TM_x$.
Consider the symplectic vector space $T\mathbb{R}^{2n}_x$. The vectors $P_1(x)$ and $Q_1(x)$
of the hamiltonian vector fields with hamiltonian functions $p_1$ and $q_1$ belong
to $T\mathbb{R}^{2n}_x$. Let $\xi \in TM_x$. The derivatives of $p_1$ and $q_1$ in the direction $\xi$ are
equal to zero. This means that $dp_1(\xi) = \omega^2(\xi, P_1) = 0$ and $dq_1(\xi) = \omega^2(\xi, Q_1) = 0$.
Thus $TM_x$ is the skew-orthogonal complement to $P_1(x)$, $Q_1(x)$. By
Section 41B, the form $\omega^2$ on $TM_x$ is nondegenerate. □

By the induction hypothesis there are symplectic coordinates in a neighborhood
of the point $x$ on the symplectic manifold $(M^{2n-2}, \omega^2|_M)$. Denote
them by $p_i$, $q_i$ ($i = 2, \ldots, n$). We extend the functions $p_2, \ldots, q_n$ to a neighborhood
of $x$ in $\mathbb{R}^{2n}$ in the following way. Every point $z$ in a neighborhood of
$x$ in $\mathbb{R}^{2n}$ can be uniquely represented in the form $z = P_1^t Q_1^s w$, where
$w \in M^{2n-2}$, and $s$ and $t$ are small numbers. We set the values of the coordinates
$p_2, \ldots, q_n$ at $z$ equal to their values at the point $w$ (Figure 179). The
$2n$ functions $p_1, \ldots, p_n, q_1, \ldots, q_n$ thus constructed form a local coordinate
system in a neighborhood of $x$ in $\mathbb{R}^{2n}$.

E  Proof  that  the  coordinates  constructed  are 
symplectic 

Denote by $P_i^t$ and $Q_i^t$ ($i = 1, \ldots, n$) the hamiltonian flows with hamiltonian
functions $p_i$ and $q_i$, and by $P_i$ and $Q_i$ the corresponding vector fields. We will
compute the Poisson brackets of the functions $p_1, \ldots, q_n$. We already saw in
C that $(q_1, p_1) = 1$. Therefore, the flows $P_1^t$ and $Q_1^s$ commute: $P_1^t Q_1^s = Q_1^s P_1^t$.
Recalling the definitions of $p_2, \ldots, q_n$ we see that each of these functions is
invariant with respect to the flows $P_1^t$ and $Q_1^s$. Thus the Poisson brackets of
$p_1$ and $q_1$ with all $2n - 2$ functions $p_i$, $q_i$ ($i > 1$) are equal to zero.

The map $P_1^t Q_1^s$ therefore commutes with all $2n - 2$ flows $P_i^t$, $Q_i^t$ ($i > 1$).
Consequently, it leaves each of the $2n - 2$ vector fields $P_i$, $Q_i$ ($i > 1$) fixed.
$P_1^t Q_1^s$ preserves the symplectic structure $\omega^2$ since the flows $P_1^t$ and $Q_1^s$ are
hamiltonian; therefore, the values of the form $\omega^2$ on the vectors of any two
of the $2n - 2$ fields $P_i$, $Q_i$ ($i > 1$) are the same at the points $z = P_1^t Q_1^s w \in \mathbb{R}^{2n}$
and $w \in M^{2n-2}$. But these values are equal to the values of the Poisson brackets
of the corresponding hamiltonian functions. Thus, the values of the
Poisson bracket of any two of the $2n - 2$ coordinates $p_i$, $q_i$ ($i > 1$) at the
points $z$ and $w$ are the same if $z = P_1^t Q_1^s w$.

The functions $p_1$ and $q_1$ are first integrals of each of the $2n - 2$ flows
$P_i^t$, $Q_i^t$ ($i > 1$). Therefore, each of the $2n - 2$ fields $P_i$, $Q_i$ is tangent to the
level manifold $p_1 = q_1 = 0$. But this manifold is $M^{2n-2}$. Therefore, each of
the $2n - 2$ fields $P_i$, $Q_i$ ($i > 1$) is tangent to $M^{2n-2}$. Consequently, these
fields are hamiltonian fields on the symplectic manifold $(M^{2n-2}, \omega^2|_M)$, and
the corresponding hamiltonian functions are $p_i|_M$, $q_i|_M$ ($i > 1$). Thus, in the
whole space $(\mathbb{R}^{2n}, \omega^2)$, the Poisson bracket of any two of the $2n - 2$ coordinates
$p_i$, $q_i$ ($i > 1$) considered on $M^{2n-2}$ is the same as the Poisson
bracket of these coordinates in the symplectic space $(M^{2n-2}, \omega^2|_M)$.

But, by our induction hypothesis, the coordinates on $M^{2n-2}$ ($p_i|_M$, $q_i|_M$;
$i > 1$) are symplectic. Therefore, in the whole space $\mathbb{R}^{2n}$, the Poisson brackets
of the constructed coordinates have the standard values

$$(p_i, p_j) = (p_i, q_j) = (q_i, q_j) = 0 \quad (i \ne j) \qquad \text{and} \qquad (q_i, p_i) = 1.$$

The Poisson brackets of the coordinates $p$, $q$ on $\mathbb{R}^{2n}$ have the same form if
$\omega^2 = \sum dp_i \wedge dq_i$. But a bilinear form $\omega^2$ is determined by its values on
pairs of basis vectors. Therefore, the Poisson brackets of the coordinate
functions determine the shape of $\omega^2$ uniquely. Thus

$$\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n,$$

and Darboux’s theorem is proved. □


The  coordinate  point  of  view  will  predominate  in  this  chapter.  The  technique 
of  generating  functions  for  canonical  transformations,  developed  by 
Hamilton  and  Jacobi,  is  the  most  powerful  method  available  for  integrating 
the  differential  equations  of  dynamics.  In  addition  to  this  technique,  the 
chapter  contains  an  “odd-dimensional”  approach  to  hamiltonian  phase 
flows. 

This  chapter  is  independent  of  the  previous  one.  It  contains  new  proofs 
of  several  of  the  results  in  Chapter  8,  as  well  as  an  explanation  of  the  origin 
of  the  theory  of  symplectic  manifolds. 


44  The  integral  invariant  of  Poincare-Cartan 

In  this  section  we  look  at  the  geometry  of  1-forms  in  an  odd-dimensional  space. 


A  A  hydrodynamical  lemma 

Let $v$ be a vector field in three-dimensional oriented euclidean space $\mathbb{R}^3$,
and $r = \operatorname{curl} v$ its curl. The integral curves of $r$ are called vortex lines. If $\gamma_1$
is any closed curve in $\mathbb{R}^3$ (Figure 180), the vortex lines passing through the
points of $\gamma_1$ form a tube called a vortex tube.

Let $\gamma_2$ be another curve encircling the same vortex tube, so that $\gamma_1 - \gamma_2 = \partial\sigma$,
where $\sigma$ is a 2-cycle representing a part of the vortex tube. Then:


Stokes’ lemma. The field $v$ has equal circulation along the curves $\gamma_1$ and $\gamma_2$:

$$\oint_{\gamma_1} v\,dl = \oint_{\gamma_2} v\,dl.$$

Proof. By Stokes’ formula, $\oint_{\gamma_1} v\,dl - \oint_{\gamma_2} v\,dl = \iint_\sigma \operatorname{curl} v\,dn = 0$, since
$\operatorname{curl} v$ is tangent to the vortex tube. □


B  The  multi-dimensional  Stokes’  lemma 

It turns out that Stokes’ lemma generalizes to the case of any odd-dimensional
manifold $M^{2n+1}$ (in place of $\mathbb{R}^3$). To formulate this generalization we replace
our vector field by a differential form.

The circulation of a vector field $v$ is the integral of the 1-form $\omega^1$
($\omega^1(\xi) = (v, \xi)$). To the curl of $v$ there corresponds the 2-form $\omega^2 = d\omega^1$
($d\omega^1(\xi, \eta) = (r, \xi, \eta)$). It is clear from these formulas that there is a direction
at every point (namely, the direction of $r$, Figure 181), having the property
that the circulation of $v$ along the boundary of every “infinitesimal square”
containing $r$ is equal to zero:

$$d\omega^1(r, \eta) = 0, \quad \forall \eta.$$

In fact, $d\omega^1(r, \eta) = (r, r, \eta) = 0$.

Figure 181 Axis invariantly connected with a 2-form in an odd-dimensional space

Remark. Passing from the 2-form $\omega^2 = d\omega^1$ to the vector field $r = \operatorname{curl} v$
is not an invariant operation; it depends on the euclidean structure of $\mathbb{R}^3$.
Only the direction⁷² of $r$ is invariantly associated with $\omega^2$ (and, therefore,
with the 1-form $\omega^1$). It is easy to verify that, if $r \ne 0$, then the direction of $r$
is uniquely determined by the condition that $\omega^2(r, \eta) = 0$ for all $\eta$.

72 I.e., the unoriented line in $T\mathbb{R}^3$ with direction vector $r$.

The algebraic basis for the multi-dimensional Stokes’ lemma is the
existence of an axis for every rotation of an odd-dimensional space.

Lemma. Let $\omega^2$ be an exterior algebraic 2-form on the odd-dimensional vector
space $\mathbb{R}^{2n+1}$. Then there is a vector $\xi \ne 0$ such that

$$\omega^2(\xi, \eta) = 0, \quad \forall \eta \in \mathbb{R}^{2n+1}.$$

Proof. A skew-symmetric form $\omega^2$ is given by a skew-symmetric matrix $A$,

$$\omega^2(\xi, \eta) = (A\xi, \eta),$$

of odd order $2n + 1$. The determinant of such a matrix is equal to zero, since

$$A' = -A, \qquad \det A = \det A' = \det(-A) = (-1)^{2n+1} \det A = -\det A.$$

Thus the determinant of $A$ is zero. This means $A$ has an eigenvector $\xi \ne 0$
with eigenvalue 0, as was to be shown. □
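For order 3 the lemma can be seen in a small computation. The following is a minimal sketch with hypothetical entries `a`, `b`, `c` and helper names: every skew-symmetric matrix of order 3 has the form built by `skew3` below, acts as the vector product with $(a, b, c)$, and therefore annihilates the vector $(a, b, c)$ itself.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def skew3(a, b, c):
    """General skew-symmetric 3x3 matrix; it maps x to (a, b, c) x x."""
    return [[0.0, -c, b],
            [c, 0.0, -a],
            [-b, a, 0.0]]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

A = skew3(2.0, -1.0, 3.0)      # hypothetical entries
xi = [2.0, -1.0, 3.0]          # candidate null vector (a, b, c)

determinant = det3(A)          # vanishes, as the lemma asserts
image = apply(A, xi)           # A xi = 0, so omega^2(xi, eta) = 0 for all eta
```

In the hydrodynamical setting of subsection A the vector $(a, b, c)$ plays the role of the curl $r$, whose direction is the axis singled out by the form.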


A vector $\xi$ for which $\omega^2(\xi, \eta) = 0$, $\forall \eta$, is called a null vector for the form $\omega^2$.
The null vectors of $\omega^2$ clearly form a linear subspace. The form $\omega^2$ is called
nonsingular if the dimension of this space is the minimal possible (i.e., 1
for an odd-dimensional space $\mathbb{R}^{2n+1}$ or 0 for an even-dimensional space).

Problem. Consider the 2-form $\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n$ on an even-dimensional
space $\mathbb{R}^{2n}$ with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n$. Show that $\omega^2$ is nonsingular.

Problem. On an odd-dimensional space $\mathbb{R}^{2n+1}$ with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n; t$,
consider the 2-form $\omega^2 = \sum dp_i \wedge dq_i - \omega^1 \wedge dt$, where $\omega^1$ is any 1-form on
$\mathbb{R}^{2n+1}$. Show that $\omega^2$ is nonsingular.

If $\omega^2$ is a nonsingular form on an odd-dimensional space $M^{2n+1}$, then
the null vectors $\xi$ of $\omega^2$ all lie on a line. This line is invariantly associated to
the form $\omega^2$.

Now let $M^{2n+1}$ be an odd-dimensional differentiable manifold and $\omega^1$
a 1-form on $M$. By the lemma above, at every point $x \in M$ there is a direction
(i.e., a straight line $\{c\xi\}$ in the tangent space $TM_x$) having the property
that the integral of $\omega^1$ along the boundary of an “infinitesimal square
containing this direction” is equal to zero:

$$d\omega^1(\xi, \eta) = 0, \quad \forall \eta \in TM_x.$$

Suppose further that the 2-form $d\omega^1$ is nonsingular. Then the direction $\xi$
is uniquely determined. We call it the “vortex direction” of the form $\omega^1$.

The integral curves of the field of vortex directions are called the vortex
lines (or characteristic lines) of the form $\omega^1$.

Let $\gamma_1$ be a closed curve on $M$. The vortex lines going out from points
of $\gamma_1$ form a “vortex tube.” We have

The multi-dimensional Stokes’ lemma. The integrals of a 1-form $\omega^1$ along any
two curves encircling the same vortex tube are the same: $\oint_{\gamma_1} \omega^1 = \oint_{\gamma_2} \omega^1$,
if $\gamma_1 - \gamma_2 = \partial\sigma$, where $\sigma$ is a piece of the vortex tube.

Proof. By Stokes’ formula

$$\oint_{\gamma_1} \omega^1 - \oint_{\gamma_2} \omega^1 = \int_{\partial\sigma} \omega^1 = \int_\sigma d\omega^1.$$

But the value of $d\omega^1$ on any pair of vectors tangent to the vortex tube is equal
to zero. (These two vectors lie in a 2-plane containing the vortex direction,
and $d\omega^1$ vanishes on this plane.) Thus, $\int_\sigma d\omega^1 = 0$. □

C Hamilton’s equations

All the basic propositions of hamiltonian mechanics follow directly from
Stokes’ lemma.

For $M^{2n+1}$ we will take the “extended phase space” $\mathbb{R}^{2n+1}$ with coordinates
$p_1, \ldots, p_n; q_1, \ldots, q_n; t$. Suppose we are given a function $H = H(p, q, t)$.
Then we can construct⁷³ the 1-form

$$\omega^1 = p\,dq - H\,dt \qquad (p\,dq = p_1\,dq_1 + \cdots + p_n\,dq_n).$$

We apply Stokes’ lemma to $\omega^1$ (Figure 182).


Figure 182 Hamiltonian field and vortex lines of the form $p\,dq - H\,dt$.

Theorem. The vortex lines of the form $\omega^1 = p\,dq - H\,dt$ on the $(2n + 1)$-dimensional
extended phase space $p, q, t$ have a one-to-one projection onto
the $t$ axis, i.e., they are given by functions $p = p(t)$, $q = q(t)$. These functions
satisfy the system of canonical differential equations with hamiltonian
function $H$:

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p}. \tag{1}$$

In other words, the vortex lines of the form $p\,dq - H\,dt$ are the trajectories
of the phase flow in the extended phase space, i.e., the integral curves of the
canonical equations (1).

73 The form $\omega^1$ seems here to appear out of thin air. In the following paragraph we will see how
the idea of using this form arose from optics.

Proof. The differential of the form $p\,dq - H\,dt$ is equal to

$$d\omega^1 = \sum_{i=1}^{n} \left( dp_i \wedge dq_i - \frac{\partial H}{\partial p_i}\, dp_i \wedge dt - \frac{\partial H}{\partial q_i}\, dq_i \wedge dt \right).$$

It is clear from this expression that the matrix of the 2-form $d\omega^1$ in the
coordinates $p, q, t$ has the form

$$A = \begin{pmatrix} 0 & -E & H_p \\ E & 0 & H_q \\ -H_p & -H_q & 0 \end{pmatrix},$$

where $E$ is the identity matrix of order $n$, $H_p = \partial H/\partial p$ and $H_q = \partial H/\partial q$
(verify this!).

The rank of this matrix is $2n$ (the upper left $2n$-corner is non-degenerate);
therefore, $d\omega^1$ is nonsingular. It can be verified directly that the vector
$(-H_q, H_p, 1)$ is an eigenvector of $A$ with eigenvalue 0 (do it!). This means
that it gives the direction of the vortex lines of the form $p\,dq - H\,dt$. But the
vector $(-H_q, H_p, 1)$ is also the velocity vector of the phase flow of (1). Thus
the integral curves of (1) are the vortex lines of the form $p\,dq - H\,dt$, as was
to be shown. □
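The two verifications left to the reader can be tried numerically for one degree of freedom. This is a minimal sketch under stated assumptions: $n = 1$, the matrix of $d\omega^1$ has rows $(0, -1, H_p)$, $(1, 0, H_q)$, $(-H_p, -H_q, 0)$, and the hamiltonian $H = p^2/2 + q^4/4$ and the sample point are hypothetical choices made only for this example.

```python
def matrix_A(H_p, H_q):
    """Matrix of d(p dq - H dt) in the coordinates (p, q, t) for n = 1."""
    return [[0.0, -1.0, H_p],
            [1.0, 0.0, H_q],
            [-H_p, -H_q, 0.0]]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

p, q = 0.8, -1.1                 # hypothetical point of phase space
H_p, H_q = p, q ** 3             # partial derivatives of H = p^2/2 + q^4/4

A = matrix_A(H_p, H_q)
velocity = [-H_q, H_p, 1.0]      # (dp/dt, dq/dt, dt/dt) along the phase flow

image = apply(A, velocity)       # should be the zero vector
skew = all(A[i][j] == -A[j][i] for i in range(3) for j in range(3))
corner_det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # upper left 2x2 corner
```

Here `image` vanishes, so the velocity vector is a null vector of the form, while `corner_det` equals 1, which is the non-degeneracy of the upper left corner used to conclude that the rank is $2n$.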

D A theorem on the integral invariant of
Poincaré-Cartan

We now apply Stokes’ lemma. We obtain the fundamental

Theorem. Suppose that the two curves $\gamma_1$ and $\gamma_2$ encircle the same tube of
phase trajectories of (1). Then the integrals of the form $p\,dq - H\,dt$ along
them are the same:

$$\oint_{\gamma_1} p\,dq - H\,dt = \oint_{\gamma_2} p\,dq - H\,dt.$$

The form $p\,dq - H\,dt$ is called the integral invariant of Poincaré-Cartan.⁷⁴

Proof. The phase trajectories are the vortex lines of the form $p\,dq - H\,dt$,
and the integrals along closed curves contained in the same vortex tube are
the same by Stokes’ lemma. □


74 In the calculus of variations $\int p\,dq - H\,dt$ is called Hilbert’s invariant integral.


Figure 183 Poincaré’s integral invariant


We will consider, in particular, curves consisting of simultaneous states,
i.e., lying in the planes $t = \text{const}$ (Figure 183). Along such curves, $dt = 0$
and $\oint p\,dq - H\,dt = \oint p\,dq$. From the preceding theorem we obtain the
important:


Corollary 1. The phase flow preserves the integral of the form $p\,dq = p_1\,dq_1 + \cdots + p_n\,dq_n$
on closed curves.

Proof. Let $g_{t_0}^{t_1}\colon \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be the transformation of the phase space $(p, q)$
realized by the phase flow from time $t_0$ to $t_1$ (i.e., $g_{t_0}^{t_1}(p_0, q_0)$ is the solution
to the canonical equations (1) with initial conditions $p(t_0) = p_0$, $q(t_0) = q_0$).
Let $\gamma$ be any closed curve in the space $\mathbb{R}^{2n} \subset \mathbb{R}^{2n+1}$ ($t = t_0$). Then $g_{t_0}^{t_1}\gamma$
is a closed curve in the space $\mathbb{R}^{2n}$ ($t = t_1$), contained in the same tube of
phase trajectories in $\mathbb{R}^{2n+1}$. Since $dt = 0$ on $\gamma$ and on $g_{t_0}^{t_1}\gamma$, we find by the
preceding theorem that $\oint_\gamma p\,dq = \oint_{g_{t_0}^{t_1}\gamma} p\,dq$, as was to be shown. □
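Corollary 1 can be illustrated for $n = 1$ with a flow that is known in closed form. The sketch below is an illustration under stated assumptions, not a proof: the hamiltonian $H = (p^2 + q^2)/2$, the triangle `gamma` and the time 1.3 are hypothetical choices. Because this flow is linear, it carries polygons to polygons, and the integral of $p\,dq$ over a polygon can be computed exactly edge by edge.

```python
import math

def loop_integral_p_dq(vertices):
    """Integral of p dq over a closed polygon with vertices (p, q).

    Along a straight edge p is a linear function of the parameter,
    so the trapezoid rule gives the exact value on each edge.
    """
    total = 0.0
    for k in range(len(vertices)):
        p_a, q_a = vertices[k]
        p_b, q_b = vertices[(k + 1) % len(vertices)]
        total += 0.5 * (p_a + p_b) * (q_b - q_a)
    return total

def oscillator_flow(point, t):
    """Exact flow of H = (p^2 + q^2)/2, i.e. dp/dt = -q, dq/dt = p."""
    p, q = point
    return (p * math.cos(t) - q * math.sin(t),
            q * math.cos(t) + p * math.sin(t))

gamma = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]    # hypothetical closed curve
moved = [oscillator_flow(v, 1.3) for v in gamma]

before = loop_integral_p_dq(gamma)
after = loop_integral_p_dq(moved)               # agrees with `before`
```

For a nonlinear flow the images of the edges are curved, so the same check would need a finer subdivision of the curve and would hold only up to discretization error.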

The form $p\,dq$ is called Poincaré’s relative integral invariant. It has a
simple geometric meaning. Let $\sigma$ be a two-dimensional oriented chain and
$\gamma = \partial\sigma$. Then, by Stokes’ formula, we find

$$\oint_\gamma p\,dq = \iint_\sigma dp \wedge dq.$$

Thus we have proved the important:


Corollary 2. The phase flow preserves the sum of the oriented areas of the
projections of a surface onto the $n$ coordinate planes $(p_i, q_i)$:

$$\iint_\sigma dp \wedge dq = \iint_{g_{t_0}^{t_1}\sigma} dp \wedge dq.$$


In other words, the 2-form $\omega^2 = dp \wedge dq$ is an absolute integral invariant
of the phase flow.


Example. For $n = 1$, $\omega^2$ is area, and we obtain Liouville’s theorem: the
phase flow preserves area.

E Canonical transformations

Let $g$ be a differentiable mapping of the phase space $\mathbb{R}^{2n} = \{(p, q)\}$ to $\mathbb{R}^{2n}$.

Definition. The mapping $g$ is called canonical, or a canonical transformation,
if $g$ preserves the 2-form $\omega^2 = \sum dp_i \wedge dq_i$.

It is clear from the argument above that this definition can be written
in any of three equivalent forms:

1. $g^*\omega^2 = \omega^2$ ($g$ preserves the 2-form $\sum dp_i \wedge dq_i$);

2. $\iint_\sigma \omega^2 = \iint_{g\sigma} \omega^2$, $\forall \sigma$ ($g$ preserves the sum of the areas of the projections
of any surface);

3. $\oint_\gamma p\,dq = \oint_{g\gamma} p\,dq$ (the form $p\,dq$ is a relative integral invariant of $g$).


Problem. Show that definitions (1) and (2) are equivalent to (3) if the domain of the map in
question is a simply connected region in the phase space $\mathbb{R}^{2n}$; in the general case $3 \Rightarrow 2 \Leftrightarrow 1$.

The corollaries above can now be formulated as:

Theorem. The transformation of phase space induced by the phase flow is
canonical.⁷⁵

Let $g\colon \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation: $g$ preserves the form $\omega^2$.
Then $g$ also preserves the exterior square of $\omega^2$:

$$g^*(\omega^2 \wedge \omega^2) = \omega^2 \wedge \omega^2 \qquad \text{and} \qquad g^*(\omega^2)^k = (\omega^2)^k.$$

The exterior powers of the form $\sum dp_i \wedge dq_i$ are proportional to the forms

$$\omega^4 = \sum_{i<j} dp_i \wedge dp_j \wedge dq_i \wedge dq_j,$$

$$\omega^{2k} = \sum_{i_1 < \cdots < i_k} dp_{i_1} \wedge \cdots \wedge dp_{i_k} \wedge dq_{i_1} \wedge \cdots \wedge dq_{i_k}.$$

Thus we have proved

Theorem. Canonical transformations preserve the integral invariants
$\omega^4, \ldots, \omega^{2n}$.

Geometrically, the integral of the form $\omega^{2k}$ is the sum of the oriented
volumes of the projections onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$.
In particular, $\omega^{2n}$ is proportional to the volume element, and we obtain:

Corollary. Canonical transformations preserve the volume element in phase
space:

the volume of $gD$ is equal to the volume of $D$, for any region $D$.

75 The proof of this theorem which is presented in the excellent book by Landau and Lifshitz
(Mechanics, Pergamon, Oxford, 1960) is incorrect.

In particular, applying this to the phase flow we obtain

Corollary. The phase flow (1) has as integral invariants the forms

$$\omega^2, \omega^4, \ldots, \omega^{2n}.$$

The last of these invariants is the phase volume, so we have again proved
Liouville’s theorem.

45  Applications  of  the  integral  invariant  of 
Poincare-Cartan 

In  this  paragraph  we  prove  that  canonical  transformations  preserve  the  form  of  Hamilton’s 
equations,  that  a  first  integral  of  Hamilton’s  equations  allows  us  to  reduce  immediately  the  order 
of  the  system  by  two  and  that  motion  in  a  natural  lagrangian  system  proceeds  along  geodesics 
of  the  configuration  space  provided  with  a  certain  riemannian  metric. 

A  Changes  of  variables  in  the  canonical  equations 

The invariant nature of the connection between the form $p\,dq - H\,dt$ and
its vortex lines gives rise to a way of writing the equations of motion in any
system of $2n + 1$ coordinates in extended phase space $\{(p, q, t)\}$.


Figure 184 Change of variables in Hamilton’s equations


Let $(x_1, \ldots, x_{2n+1})$ be coordinate functions in some chart of extended
phase space (considered as a manifold $M^{2n+1}$, Figure 184). The coordinates
$(p, q, t)$ can be considered as giving another chart on $M$. The form $\omega^1 = p\,dq - H\,dt$
can be considered as a differential 1-form on $M$. Invariantly
associated (not depending on the chart) to this form is a family of lines on $M$:
the vortex lines. In the chart $(p, q, t)$, these lines are represented as the trajectories
of the phase flow

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p} \tag{1}$$

with hamiltonian function $H(p, q, t)$.

Suppose that in the coordinates $(x_1, \ldots, x_{2n+1})$ the form $\omega^1$ is written as

$$p\,dq - H\,dt = X_1\,dx_1 + \cdots + X_{2n+1}\,dx_{2n+1}.$$

Theorem. In the chart $(x_i)$, the trajectories of (1) are represented by the vortex
lines of the form $\sum X_i\,dx_i$.

Proof. The vortex lines of the forms $\sum X_i\,dx_i$ and $p\,dq - H\,dt$ are the images
in two different charts of the vortex lines of the same form on $M$. But the
integral curves of (1) are the vortex lines of $p\,dq - H\,dt$. Thus, their images
in the chart $(x_i)$ are the vortex lines of the form $\sum X_i\,dx_i$. □

Corollary. Let $(P_1, \ldots, P_n; Q_1, \ldots, Q_n; T)$ be a coordinate system on the
extended phase space $(p, q, t)$ and $K(P, Q, T)$ and $S(P, Q, T)$ functions
such that

$$p\,dq - H\,dt = P\,dQ - K\,dT + dS$$

(the left- and right-hand sides are forms on extended phase space).

Then the trajectories of the phase flow (1) are represented in the chart
$(P, Q, T)$ by the integral curves of the canonical equations

$$\frac{dP}{dT} = -\frac{\partial K}{\partial Q}, \qquad \frac{dQ}{dT} = \frac{\partial K}{\partial P}. \tag{2}$$

Proof. By the theorem above, the trajectories of (1) are represented by the
vortex lines of the form $P\,dQ - K\,dT + dS$. But $dS$ has no influence on
the vortex lines (since $ddS = 0$). Therefore, the images of the trajectories of (1)
are the vortex lines of the form $P\,dQ - K\,dT$. According to Section 44, C,
the vortex lines of such a form are integral curves of the canonical equations
(2). □

In particular, let $g\colon \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation of phase
space taking a point with coordinates $(p, q)$ to a point with coordinates
$(P, Q)$. The functions $P(p, q)$ and $Q(p, q)$ can be considered as new coordinates
on phase space.


Theorem. In the new coordinates $(P, Q)$ the canonical equations (1) have
the canonical form⁷⁶

$$\frac{dP}{dt} = -\frac{\partial K}{\partial Q}, \qquad \frac{dQ}{dt} = \frac{\partial K}{\partial P} \tag{3}$$

with the same hamiltonian function: $K(P, Q, t) = H(p, q, t)$.


76 In some textbooks the property of preserving the canonical form of Hamilton’s equations is
taken as the definition of a canonical transformation. This definition is not equivalent to the
generally accepted one mentioned above. For example, the transformation $P = 2p$, $Q = q$,
which is not canonical by our definition, preserves the hamiltonian form of the equations of
motion. This confusion appears even in the excellent textbook by Landau and Lifshitz (Mechanics,
Oxford, Pergamon, 1960); in Section 45 of this book they show that every transformation which
preserves the canonical equations is canonical in our sense.


Figure 185 Closedness of the form $p\,dq - P\,dQ$


Proof. Consider the 1-form $p\,dq - P\,dQ$ on $\mathbb{R}^{2n}$. For any closed curve $\gamma$
we have (Figure 185)

$$\oint_\gamma p\,dq - P\,dQ = \oint_\gamma p\,dq - \oint_\gamma P\,dQ = 0$$

since $g$ is canonical. Therefore, $\int_{(p_0, q_0)}^{(p_1, q_1)} p\,dq - P\,dQ = S$ does not depend on
the path of integration but only on the endpoint $(p_1, q_1)$ (for a fixed initial
point $(p_0, q_0)$). Thus $dS = p\,dq - P\,dQ$. Consequently, in the extended
phase space, we have

$$p\,dq - H\,dt = P\,dQ - H\,dt + dS.$$

Thus, the theorem above is applicable, and (2) is transformed to (3). □
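For one degree of freedom the definition of a canonical transformation reduces to a single number, since $dP \wedge dQ$ equals $dp \wedge dq$ multiplied by the Jacobian determinant of $(P, Q)$ with respect to $(p, q)$. The sketch below uses this to test two maps. The helper `jacobian_det`, the sample point and the second map are hypothetical; the first map is the transformation $P = 2p$, $Q = q$ from the footnote.

```python
def jacobian_det(g, p, q, step=1e-6):
    """Determinant of d(P, Q)/d(p, q) at (p, q), by central differences."""
    P_hi, Q_hi = g(p + step, q)
    P_lo, Q_lo = g(p - step, q)
    dP_dp = (P_hi - P_lo) / (2.0 * step)
    dQ_dp = (Q_hi - Q_lo) / (2.0 * step)
    P_hi, Q_hi = g(p, q + step)
    P_lo, Q_lo = g(p, q - step)
    dP_dq = (P_hi - P_lo) / (2.0 * step)
    dQ_dq = (Q_hi - Q_lo) / (2.0 * step)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

def stretch(p, q):
    """P = 2p, Q = q: the example from the footnote."""
    return (2.0 * p, q)

def shear(p, q):
    """P = p + q^2, Q = q: a hypothetical nonlinear example."""
    return (p + q * q, q)

det_stretch = jacobian_det(stretch, 0.3, -0.8)   # about 2: not canonical
det_shear = jacobian_det(shear, 0.3, -0.8)       # about 1: canonical
```

The stretch doubles $dp \wedge dq$ and so is not canonical, although one checks directly that it carries equations (1) into canonical equations with the different hamiltonian function $K = 2H$.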

Problem. Let $g(t)\colon \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation of phase space depending on the
parameter $t$, $g(t)(p, q) = (P(p, q, t), Q(p, q, t))$. Show that in the variables $P, Q, t$ the canonical
equations (1) have the canonical form with new hamiltonian function

$$K(P, Q, t) = H(p, q, t) + \frac{\partial S}{\partial t} + P\,\frac{\partial Q}{\partial t},$$

where

$$S(p_1, q_1, t) = \int_{(p_0, q_0)}^{(p_1, q_1)} p\,dq - P\,dQ.$$

B  Reduction  of  order  using  the  energy  integral 

Suppose  now  that  the  hamiltonian  function  H( p,  q)  does  not  depend  on  time. 
Then  the  canonical  equations  (1)  have  a  first  integral:  H(p(t),  q(f))  =  const. 
It  turns  out  that  by  using  this  integral  we  can  reduce  the  dimension  (2n  +  1) 
of  the  extended  phase  space  by  two,  thereby  reducing  the  problem  to  in¬ 
tegration  of  a  system  of  canonical  equations  in  a  (2n  —  l)-dimensional  space. 

We assume that (in some region) the equation $h = H(p_1, \ldots, p_n; q_1, \ldots, q_n)$ can be solved for $p_1$:

$$p_1 = K(P, Q, T; h),$$

where $P = (p_2, \ldots, p_n)$; $Q = (q_2, \ldots, q_n)$; $T = -q_1$. Then we find

$$p\,dq - H\,dt = P\,dQ - K\,dT - d(Ht) + t\,dH.$$

Now let $\gamma$ be an integral curve of the canonical equations (1) lying on the $2n$-dimensional surface $H(p, q) = h$ in $\mathbb{R}^{2n+1}$. Then $\gamma$ is a vortex line of the form $p\,dq - H\,dt$ (Figure 186). We project the extended phase space $\mathbb{R}^{2n+1} = \{(p, q, t)\}$ onto the phase space $\mathbb{R}^{2n} = \{(p, q)\}$. The surface $H = h$ is projected onto a $(2n - 1)$-dimensional manifold $M^{2n-1}$: $H(p, q) = h$ in $\mathbb{R}^{2n}$, and $\gamma$ is projected to a curve $\bar\gamma$ lying on this submanifold. The variables $P, Q, T$ form local coordinates on $M^{2n-1}$.


Figure 186 Lowering the order of a hamiltonian system

Problem. Show that the curve $\bar\gamma$ is a vortex line of the form $p\,dq = P\,dQ - K\,dT$ on $M^{2n-1}$.
Hint. $d(Ht)$ does not affect the vortex lines, and $dH$ is zero on $M$.


But the vortex lines of $P\,dQ - K\,dT$ satisfy Hamilton’s equations (2). Thus we have proved

Theorem. The phase trajectories of the equations (1) on the surface $M^{2n-1}$, $H = h$, satisfy the canonical equations

$$\frac{dp_i}{dq_1} = \frac{\partial K}{\partial q_i}, \qquad \frac{dq_i}{dq_1} = -\frac{\partial K}{\partial p_i} \qquad (i = 2, \ldots, n),$$

where the function $K(p_2, \ldots, p_n; q_2, \ldots, q_n; T, h)$ is defined by the equation $H(K, p_2, \ldots, p_n; -T, q_2, \ldots, q_n) = h$.
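The reduced equations can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the hamiltonian $H = (p_1^2 + p_2^2)/2 + q_2^2/2$, the energy value, and all function names are hypothetical choices, not taken from the text. It solves $H = h$ for $p_1 = K$, differentiates $K$ by central differences, and compares the result with the ratios obtained directly from Hamilton's equations.

```python
import math

H_LEVEL = 1.0  # illustrative energy constant h (an assumption of this sketch)


def hamiltonian(p1, p2, q1, q2):
    """Illustrative H = (p1^2 + p2^2)/2 + q2^2/2; it depends on neither q1 nor t."""
    return 0.5 * (p1 * p1 + p2 * p2) + 0.5 * q2 * q2


def reduced_K(p2, q2, h=H_LEVEL):
    """Solve H = h for p1 (positive branch): p1 = K(P, Q, T; h), P = p2, Q = q2."""
    return math.sqrt(2.0 * h - p2 * p2 - q2 * q2)


def reduced_rhs(p2, q2, h=H_LEVEL, eps=1e-6):
    """Reduced system: dp2/dq1 = dK/dq2 and dq2/dq1 = -dK/dp2 (central differences)."""
    dK_dq2 = (reduced_K(p2, q2 + eps, h) - reduced_K(p2, q2 - eps, h)) / (2 * eps)
    dK_dp2 = (reduced_K(p2 + eps, q2, h) - reduced_K(p2 - eps, q2, h)) / (2 * eps)
    return dK_dq2, -dK_dp2


def original_rhs(p2, q2, h=H_LEVEL):
    """The same ratios from Hamilton's equations on the surface H = h."""
    p1 = reduced_K(p2, q2, h)   # the point lies on the surface H = h
    dq1_dt = p1                 # dH/dp1
    dq2_dt = p2                 # dH/dp2
    dp2_dt = -q2                # -dH/dq2
    return dp2_dt / dq1_dt, dq2_dt / dq1_dt


if __name__ == "__main__":
    print(reduced_rhs(0.3, 0.4))
    print(original_rhs(0.3, 0.4))
```

The two printed pairs should agree up to the finite-difference error; the positive branch of the square root is an arbitrary choice and corresponds to motion with $p_1 > 0$.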

C  The  principle  of  least  action  in  phase  space 

In the extended phase space $\{(p, q, t)\}$, we consider an integral curve $\gamma$ of the canonical equations (1) connecting the points $(p_0, q_0, t_0)$ and $(p_1, q_1, t_1)$.

Theorem. The integral $\int p\,dq - H\,dt$ has $\gamma$ as an extremal under variations of $\gamma$ for which the ends of the curve remain in the $n$-dimensional subspaces $(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$.

Proof. The curve $\gamma$ is a vortex line of the form $p\,dq - H\,dt$ (Figure 187). Therefore, the integral of $p\,dq - H\,dt$ over an “infinitely small parallelogram passing through the vortex direction” is equal to zero. In other words, the increment $\int_{\gamma'} - \int_\gamma p\,dq - H\,dt$ is small to a higher order in comparison with the difference of the curves $\gamma$ and $\gamma'$, as was to be shown.

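If this argument does not seem rigorous enough, it can be replaced by the computation

$$\delta \int_\gamma (p\dot q - H)\,dt = \int_\gamma \left( \dot q\,\delta p + p\,\delta\dot q - \frac{\partial H}{\partial p}\,\delta p - \frac{\partial H}{\partial q}\,\delta q \right) dt = p\,\delta q\,\Big|_0^1 + \int_\gamma \left[ \left( \dot q - \frac{\partial H}{\partial p} \right) \delta p - \left( \dot p + \frac{\partial H}{\partial q} \right) \delta q \right] dt.$$

We see that the integral curves of Hamilton’s equations are the only extremals of the integral $\int p\,dq - H\,dt$ in the class of curves $\gamma$ whose ends lie in the $n$-dimensional subspaces $(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$ of extended phase space. □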


Remark. The principle of least action in Hamilton’s form is a particular case of the principle considered above. Along extremals, we have

$$\int_{(p_0, q_0, t_0)}^{(p_1, q_1, t_1)} p\,dq - H\,dt = \int_{t_0}^{t_1} (p\dot q - H)\,dt = \int_{t_0}^{t_1} L\,dt$$

(since the lagrangian $L$ and the hamiltonian $H$ are Legendre transforms of one another). Now let $\bar\gamma$ (Figure 188) be the projection of the extremal $\gamma$ onto the $q, t$ plane. To any nearby curve $\bar\gamma'$ connecting the same points $(t_0, q_0)$ and $(t_1, q_1)$ in the $q, t$ plane we associate a curve $\gamma'$ in the phase space $(p, q, t)$ by setting $p = \partial L/\partial \dot q$. Then, along $\gamma'$, too, $\int_{\gamma'} p\,dq - H\,dt = \int_{\bar\gamma'} L\,dt$. But by the theorem above, $\delta \int p\,dq - H\,dt = 0$ for any variation of the curve $\gamma$ (with boundary conditions $(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$). In particular, this is true for variations of the special form taking $\gamma$ to $\gamma'$. Thus $\bar\gamma$ is an extremal of $\int L\,dt$, as was to be shown.

Figure 188 Comparison curves for the principles of least action in the configuration and phase spaces


In the theorem above we are allowed to compare $\gamma$ with a significantly wider class of curves $\gamma'$ than in Hamilton’s principle: there are no restrictions placed on the relation of $p$ with $\dot q$. Surprisingly, one can show that the two principles are nevertheless equivalent: an extremal in the narrower class of variations ($p = \partial L/\partial \dot q$) is an extremal under all variations. The explanation is that, for fixed $\dot q$, the value $p = \partial L/\partial \dot q$ is an extremal of $p\dot q - H$ (cf. the definition of the Legendre transform, Section 14).

D The principle of least action in the Maupertuis-Euler-Lagrange-Jacobi form

Suppose now that the hamiltonian function $H(p, q)$ does not depend on time. Then $H(p, q)$ is a first integral of Hamilton’s equations (1). We project the surface $H(p, q) = h$ from the extended phase space $\{(p, q, t)\}$ to the space $\{(p, q)\}$. We obtain a $(2n - 1)$-dimensional surface $H(p, q) = h$ in $\mathbb{R}^{2n}$, which we already studied in subsection B and which we denoted by $M^{2n-1}$.

The phase trajectories of the canonical equations (1) beginning on the surface $M^{2n-1}$ lie entirely in $M^{2n-1}$. They are the vortex lines of the form $p\,dq = P\,dQ - K\,dT$ (in the notation of B) on $M^{2n-1}$. By the theorem in subsection C, the curves (1) on $M^{2n-1}$ are extremals for the variational principle corresponding to this form. Therefore, we have proved

Theorem. If the hamiltonian function $H = H(p, q)$ does not depend on time, then the phase trajectories of the canonical equations (1) lying on the surface $M^{2n-1}$: $H(p, q) = h$ are extremals of the integral $\int p\,dq$ in the class of curves lying on $M^{2n-1}$ and connecting the subspaces $q = q_0$ and $q = q_1$.

Figure 189 Maupertuis’ principle

We now consider the projection onto the $q$-space of an extremal lying on the surface $H(p, q) = h$. This curve connects the points $q_0$ and $q_1$. Let $\gamma$ be another curve connecting the points $q_0$ and $q_1$ (Figure 189). The curve $\gamma$ is the projection of some curve $\hat\gamma$ on $M^{2n-1}$. Specifically, we parametrize $\gamma$ by $\tau$, $a \le \tau \le b$, $\gamma(a) = q_0$, $\gamma(b) = q_1$. Then at every point $q$ of $\gamma$ there is a velocity vector $\dot q = d\gamma(\tau)/d\tau$, and the corresponding momentum $p = \partial L/\partial \dot q$. If the parameter $\tau$ is chosen so that $H(p, q) = h$, then we obtain a curve $\hat\gamma$: $q = \gamma(\tau)$, $p = \partial L/\partial \dot q$ on the surface $M^{2n-1}$. Applying the theorem above to the curve $\hat\gamma$ on $M^{2n-1}$, we obtain


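Corollary. Among all curves $q = \gamma(\tau)$ connecting the two points $q_0$ and $q_1$ on the plane $q$ and parametrized so that the hamiltonian function has a fixed value $H(\partial L/\partial \dot q, q) = h$, the trajectory of the equations of dynamics (1) is an extremal of the integral of “reduced action”

$$\int_\gamma p\,dq = \int_\gamma p\,\dot q\,d\tau = \int_\gamma \frac{\partial L}{\partial \dot q}(\tau)\,\dot q(\tau)\,d\tau.$$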


This is also the principle of least action of Maupertuis (Euler-Lagrange-Jacobi).⁷⁷ It is important to note that the interval $a \le \tau \le b$ parametrizing the curve $\gamma$ is not fixed and can be different for different curves being compared. On the other hand, the energy (the hamiltonian function) must be the same. We note also that the principle determines the shape of a trajectory but not the time: in order to determine the time we must use the energy constant.

The  principle  above  takes  a  particularly  simple  form  in  the  case  when  the 
system  represents  inertial  motion  on  a  smooth  manifold. 


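Theorem. A point mass confined to a smooth riemannian manifold moves along geodesic lines (i.e., along extremals of the length $\int ds$).

Proof. In this case,

$$H = L = T = \frac{1}{2}\left(\frac{ds}{d\tau}\right)^2, \qquad \frac{\partial L}{\partial \dot q}\,\dot q = 2T = \left(\frac{ds}{d\tau}\right)^2.$$

Therefore, in order to guarantee a fixed value of $H = h$, the parameter $\tau$ must be chosen proportional to the length: $d\tau = ds/\sqrt{2h}$. The reduced action integral is then equal to

$$\int_\gamma \frac{\partial L}{\partial \dot q}\,\dot q\,d\tau = \int_\gamma \sqrt{2h}\,ds = \sqrt{2h} \int_\gamma ds;$$

therefore, extremals are geodesics of our manifold. □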


In the case when there is a potential energy, the trajectories of the equations of dynamics are also geodesics in a certain riemannian metric.


77 “In almost all textbooks, even the best, this principle is presented so that it is impossible to understand.” (K. Jacobi, Lectures on Dynamics, 1842-1843). I do not choose to break with tradition. A very interesting “proof” of Maupertuis’ principle is in Section 44 of the mechanics textbook of Landau and Lifshitz (Mechanics, Oxford, Pergamon, 1960).
Let $ds^2$ be a riemannian metric on configuration space which gives the kinetic energy (so that $T = \frac{1}{2}(ds/d\tau)^2$). Let $h$ be a constant.

Theorem. In the region of configuration space where $U(q) < h$ we define a riemannian metric by the formula

$$d\rho = \sqrt{h - U(q)}\,ds.$$

Then the trajectories of the system with kinetic energy $T = \frac{1}{2}(ds/d\tau)^2$, potential energy $U(q)$, and total energy $h$ will be geodesic lines of the metric $d\rho$.

Proof. In this case $L = T - U$, $H = T + U$, and $(\partial L/\partial \dot q)\,\dot q = 2T = (ds/d\tau)^2 = 2(h - U)$. Therefore, in order to guarantee a fixed value of $H = h$, the parameter $\tau$ must be chosen proportional to length: $d\tau = ds/\sqrt{2(h - U)}$. The reduced action integral will then be equal to

$$\int_\gamma \frac{\partial L}{\partial \dot q}\,\dot q\,d\tau = \int_\gamma \sqrt{2(h - U)}\,ds = \sqrt{2} \int_\gamma d\rho.$$

By Maupertuis’ principle, the trajectories are geodesics in the metric $d\rho$, as was to be shown. □
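The theorem can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the text: a unit point mass in the plane with $U = G\,y$, the parabola $x = t$, $y = V_0 t - G t^2/2$ as the true motion, and a comparison family $y + \varepsilon \sin \pi x$ with the same endpoints; all names and constants are hypothetical. It estimates the first variation of $\int \sqrt{h - U}\,ds$ by a central difference and finds that it vanishes on the parabola but not on the straight chord joining the same endpoints.

```python
import math

G = 1.0    # illustrative field strength, U(x, y) = G * y (assumption)
V0 = 0.5   # illustrative initial vertical velocity; the horizontal velocity is 1
H_LEVEL = 0.5 * (1.0 + V0 * V0)  # total energy of x = t, y = V0*t - G*t^2/2


def jacobi_length(y, dy, n=2000):
    """Midpoint-rule value of the integral of sqrt(h - U) ds along the graph y(x), 0 <= x <= 1."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += math.sqrt(H_LEVEL - G * y(x)) * math.sqrt(1.0 + dy(x) ** 2) * dx
    return total


def first_variation(y, dy, eps=1e-3):
    """Central-difference derivative of the length along the family y + eps*sin(pi*x)."""
    def shifted(sign):
        return jacobi_length(
            lambda x: y(x) + sign * eps * math.sin(math.pi * x),
            lambda x: dy(x) + sign * eps * math.pi * math.cos(math.pi * x),
        )
    return (shifted(+1.0) - shifted(-1.0)) / (2.0 * eps)


# With G = 1 and V0 = 0.5 both curves join (0, 0) to (1, 0).
parabola = (lambda x: V0 * x - 0.5 * G * x * x, lambda x: V0 - G * x)  # true trajectory
chord = (lambda x: 0.0, lambda x: 0.0)                                  # not a trajectory

if __name__ == "__main__":
    print("parabola:", first_variation(*parabola))
    print("chord:   ", first_variation(*chord))
```

Only one direction of variation is tested here, so a vanishing result is consistent with the theorem rather than a proof of it.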

Remark 1. The metric $d\rho$ is obtained from $ds$ by a “stretching” depending on the point $q$ but not depending on the direction. Therefore, angles in the metric $d\rho$ are the same as angles in the metric $ds$. On the boundary of the region $U \le h$ the metric $d\rho$ has a singularity: the closer we come to the boundary, the smaller the $\rho$-length becomes. In particular, the length of any curve lying in the boundary ($U = h$) is equal to zero.

Remark  2.  If  the  initial  and  endpoints  of  a  geodesic  y  are  sufficiently  close, 
then  the  extremum  of  length  is  a  minimum.  This  justifies  the  name  “principle 
of  least  action.”  In  general,  an  extremum  of  the  action  is  not  necessarily  a 
minimum,  as  we  see  by  considering  geodesics  on  the  unit  sphere  (Figure  190). 
Every arc of a great circle is a geodesic, but only those with length less than $\pi$ are minimal: the arc $NS'M$ is shorter than the great circle arc $NSM$.


Figure  190  Non-minimal  geodesic 
Remark 3. If $h$ is larger than the maximum value of $U$ on the configuration space, then the metric $d\rho$ has no singularities; therefore, we can apply topological theorems about geodesics on riemannian manifolds to the study of mechanical systems. For example, we consider the torus $T^2$ with some riemannian metric. Among all closed curves on $T^2$ making $m$ rotations around the parallel and $n$ around the meridian, there exists a curve of shortest length (Figure 191).

Figure 191 Periodic motion of a double pendulum

This curve is a closed geodesic (for a proof see books on the calculus of variations or “Morse theory”). On the other hand, the torus $T^2$ is the configuration space of a planar double pendulum. Therefore,

Theorem.  For  any  integers  m  and  n  there  is  a  periodic  motion  of  the  double 
pendulum  under  which  one  segment  makes  m  rotations  while  the  other 
segment  makes  n  rotations. 

Furthermore,  such  periodic  motions  exist  for  any  sufficiently  large  values 
of  the  constant  h  (h  must  be  larger  than  the  potential  energy  at  the  highest 
position). 

As  a  last  example  we  consider  a  rigid  body  fastened  at  a  stationary  point 
and  located  in  an  arbitrary  potential  field.  The  configuration  space  (SO(3)) 
is  not  simply  connected :  there  exist  non-contractible  curves  in  it.  The  above 
arguments  imply 

Theorem.  In  any  potential  force  field,  there  exists  at  least  one  periodic  motion 
of  the  body.  Furthermore,  there  exist  periodic  motions  for  which  the  total 
energy  h  is  arbitrarily  large. 


46  Huygens’  principle 

The fundamental notions of hamiltonian mechanics (momenta, the hamiltonian function $H$, the form $p\,dq - H\,dt$ and the Hamilton-Jacobi equations, all of which we will be concerned with below) arose by transforming several very simple and natural notions of geometric optics, guided by a particular variational principle, that of Fermat, into general variational principles (and in particular into Hamilton’s principle of stationary action, $\delta \int L\,dt = 0$).
A Wave fronts

We consider briefly⁷⁸ the fundamental notions of geometric optics. According to the extremal principle of Fermat, light travels from a point $q_0$ to a point $q_1$ in the shortest possible time. The speed of the light can depend both on the point $q$ (an “inhomogeneous medium”) and on the direction of the ray (in an “anisotropic medium,” such as a crystal). The characteristics of a medium can be described by giving a surface (the “indicatrix”) in the tangent space at each point $q$. To do this, we take in every direction the velocity vector of the propagation of light at the given point in the given direction (Figure 192).


Figure  192  An  anisotropic,  inhomogeneous  medium 


Figure  193  Envelope  of  wave  fronts 


Now let $t > 0$. We look at the set of all points $q$ to which light from a given point $q_0$ can travel in time less than or equal to $t$. The boundary of this set, $\Phi_{q_0}(t)$, is called the wave front of the point $q_0$ after time $t$ and consists of points to which light can travel in time $t$ and not faster.

There  is  a  remarkable  relation,  discovered  by  Huygens,  between  the  wave 
fronts  corresponding  to  different  values  of  t.  (Figure  193) 

78  We  will  not  pursue  rigor  here,  and  will  assume  that  all  determinants  are  different  from  zero, 
etc.  The  proofs  of  the  subsequent  theorems  do  not  depend  on  the  semi-heuristic  arguments  of 
this  paragraph. 
Huygens’ theorem. Let $\Phi_{q_0}(t)$ be the wave front of the point $q_0$ after time $t$. For every point $q$ of this front, consider the wave front after time $s$, $\Phi_q(s)$. Then the wave front of the point $q_0$ after time $s + t$, $\Phi_{q_0}(s + t)$, will be the envelope of the fronts $\Phi_q(s)$, $q \in \Phi_{q_0}(t)$.

Proof. Let $q_{t+s} \in \Phi_{q_0}(t + s)$. Then there exists a path from $q_0$ to $q_{t+s}$ along which the time of travel of light equals $t + s$, and there is none shorter. We look at the point $q_t$ on this path, to which light travels in time $t$. No shorter path from $q_0$ to $q_t$ can exist; otherwise, the path $q_0 q_{t+s}$ would not be the shortest. Therefore, the point $q_t$ lies on the front $\Phi_{q_0}(t)$. In exactly the same way light travels the path $q_t q_{t+s}$ in time $s$, and there is no shorter path from $q_t$ to $q_{t+s}$. Therefore, the point $q_{t+s}$ lies on the front of the point $q_t$ at time $s$, $\Phi_{q_t}(s)$. We will show that the fronts $\Phi_{q_t}(s)$ and $\Phi_{q_0}(t + s)$ are tangent. In fact, if they crossed each other (Figure 194), then it would be possible to reach some points of $\Phi_{q_0}(t + s)$ from $q_t$ in time less than $s$, and therefore from $q_0$ in time less than $s + t$. This contradicts the definition of $\Phi_{q_0}(t + s)$; and so the fronts $\Phi_{q_t}(s)$ and $\Phi_{q_0}(t + s)$ are tangent at the point $q_{t+s}$, as was to be proved. □


Figure  194  Proof  of  Huygens’  theorem 

The  theorem  which  has  been  proved  is  called  Huygens’  principle.  It  is 
clear  that  the  point  q0  could  be  replaced  by  a  curve,  surface,  or,  in  general, 
by  a  closed  set,  the  three-dimensional  space  {q}  by  any  smooth  manifold, 
and  propagation  of  light  by  the  propagation  of  any  disturbance  transmitting 
itself  “locally.” 
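As a simple illustration of Huygens’ principle, consider the special case (an assumption of this sketch, not the general setting of the theorem) of a homogeneous isotropic medium in the plane with unit speed of light, so that every front $\Phi_q(s)$ is a circle of radius $s$ about $q$. The hypothetical helper functions below sample the secondary fronts issuing from $\Phi_{q_0}(t)$ and check that the farthest points they reach lie at distance $t + s$ from $q_0$, i.e., on $\Phi_{q_0}(t + s)$.

```python
import math


def front(center, radius, n=360):
    """Sample the front of a point source: a circle, since the medium is homogeneous and isotropic."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / n),
         cy + radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def secondary_front_points(q0, t, s, n=360):
    """Sampled points of all fronts Phi_q(s), with q ranging over Phi_q0(t)."""
    points = []
    for q in front(q0, t, n):
        points.extend(front(q, s, n))
    return points


def outer_envelope_radius(q0, t, s, n=360):
    """Largest distance from q0 reached by the secondary fronts."""
    return max(math.dist(q0, p) for p in secondary_front_points(q0, t, s, n))


if __name__ == "__main__":
    print(outer_envelope_radius((0.0, 0.0), 1.0, 0.5))
```

In this example the family of circles also has an inner envelope of radius $|t - s|$; the wave front is the outer component, since the inner points can be reached from $q_0$ in less time.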

Huygens’ principle leads to two descriptions of the process of propagation. First, we can trace the rays, i.e., the shortest paths of the propagation of light. In this case the local character of the propagation is given by a velocity vector $\dot q$. If the direction of the ray is known, then the magnitude of the velocity vector is given by the characteristics of the medium (the indicatrix).

On  the  other  hand,  we  can  trace  the  wave  fronts.  Assuming  that  we  are 
given  a  riemannian  metric  on  the  space  {q},  we  can  talk  about  the  velocity 
of  motion  of  the  wave  front.  We  look,  for  example,  at  the  propagation  of 
light  in  a  medium  filling  ordinary  euclidean  space.  Then  one  can  characterize 
the  motion  of  the  wave  front  by  a  vector  p  perpendicular  to  the  front,  which 
will  be  constructed  in  the  following  manner. 
Figure 195 Direction of a ray and direction of motion of the wave front

For every point $q_0$ we define the function $S_{q_0}(q)$ as the optical length of the path from $q_0$ to $q$, i.e., the least time of the propagation of light from $q_0$ to $q$. The level set $\{q : S_{q_0}(q) = t\}$ is nothing other than the wave front $\Phi_{q_0}(t)$ (Figure 195). The gradient of the function $S$ (in the sense of the metric mentioned above) is perpendicular to the wave front and characterizes the motion of the wave front. In this connection, the bigger the gradient, the slower the front moves. Therefore, Hamilton called the vector

$$p = \frac{\partial S}{\partial q}$$

the vector of normal slowness of the front.

The direction of the ray $\dot q$ and the direction of motion of the front $p$ do not coincide in an anisotropic medium. However, they are related to one another by a simple relationship, easily derived from Huygens’ principle. Recall that the characteristics of the medium are at every point described by a surface of velocity vectors of light, the indicatrix.

Definition.  The  direction  of  the  hyperplane  tangent  to  the  indicatrix  at  the 
point  v  is  called  conjugate  to  the  direction  v  (Figure  196). 

Theorem. The direction of the wave front $\Phi_{q_0}(t)$ at the point $q_t$ is conjugate to the direction of the ray $\dot q$.

Proof. We look (Figure 197) at points $q_\tau$ of the ray $q_0 q_t$, $0 \le \tau \le t$. Take $\varepsilon$ very small. Then the front $\Phi_{q_{t-\varepsilon}}(\varepsilon)$ differs by quantities of order $O(\varepsilon^2)$ from the indicatrix at the point $q_t$, contracted by $\varepsilon$. By Huygens’ principle, this front $\Phi_{q_{t-\varepsilon}}(\varepsilon)$ is tangent to the front $\Phi_{q_0}(t)$ at the point $q_t$. Passing to the limit as $\varepsilon \to 0$, we obtain the theorem. □

Figure 196 Conjugate hyperplane

Figure 197 Conjugacy of the direction of a wave and of the front


If the auxiliary metric used to define the vector $p$ is changed, the natural velocity of the motion of the front, i.e. both the magnitude and direction of the vector $p$, will be changed. However, the differential form $p\,dq = dS$ on the space $\{q\} = \mathbb{R}^3$ is defined in a way which is independent of the auxiliary metric; its value depends only on the chosen fronts (or rays). On the hyperplane conjugate to the velocity vector of a ray, this form is equal to zero, and its value on the velocity vector is equal to 1.⁷⁹

B  The  optical-mechanical  analogy 

We  return  now  to  mechanics.  Here  the  trajectories  of  motion  are  also 
extremals  of  a  variational  principle,  and  one  can  construct  mechanics  as 
the geometric optics of a many-dimensional space, as Hamilton did; we will
not  develop  this  construction  in  full  detail,  but  will  only  enumerate  those 
optical  concepts  which  led  Hamilton  to  basic  mechanical  concepts. 


Optics                                      Mechanics

Optical medium                              Extended configuration space $\{(q, t)\}$
Fermat’s principle                          Hamilton’s principle $\delta \int L\,dt = 0$
Rays                                        Trajectories $q(t)$
Indicatrices                                Lagrangian $L$
Normal slowness vector $p$ of the front     Momentum $p$
Expression of $p$ in terms of               Legendre transformation
  the velocity of the ray, $\dot q$
1-form $p\,dq$                              1-form $p\,dq - H\,dt$

79  In  this  way,  the  vectors  p  corresponding  to  various  fronts  passing  through  a  given  point  are  not 
arbitrary,  but  are  subject  to  one  condition:  the  permissible  values  of  p  fill  a  hypersurface  in 
{p}-space  which  is  dual  to  the  indicatrix  of  velocities. 
The optical length of the path $S_{q_0}(q)$ and Huygens’ principle have not yet been used. Their mechanical analogues are the action function and the Hamilton-Jacobi equation, to which we now turn.

C Action as a function of coordinates and time

Definition. The action function $S(q, t)$ is the integral

$$S_{q_0, t_0}(q, t) = \int_\gamma L\,dt$$

along the extremal $\gamma$ connecting the points $(q_0, t_0)$ and $(q, t)$.

In order for this definition to be correct, we must take several precautions: we must require that the extremals going from the point $(q_0, t_0)$ do not intersect elsewhere, but instead form a so-called “central field of extremals” (Figure 198). More precisely, we associate to every pair $(\dot q_0, t)$ a point $(q, t)$ which is the end of the extremal with initial condition $q(0) = q_0$, $\dot q(0) = \dot q_0$. We say that an extremal $\gamma$ is contained in a central field if the mapping $(\dot q_0, t) \to (q, t)$ is nondegenerate (at the point corresponding to the extremal $\gamma$ under consideration, and therefore in some neighborhood of it).

It can be shown that for $|t - t_0|$ small enough the extremal $\gamma$ is contained in a central field.⁸⁰

We now look at a sufficiently small neighborhood of the endpoint $(q, t)$ of our extremal. Every point of this neighborhood is connected to $(q_0, t_0)$ by a unique extremal of the central field under consideration. This extremal depends differentiably on the endpoint $(q, t)$. Therefore, in the indicated neighborhood the action function is correctly defined:

$$S_{q_0, t_0}(q, t) = \int_\gamma L\,dt.$$

In  geometric  optics  we  were  looking  at  the  differential  of  the  optical 
length  of  a  path.  It  is  natural  here  to  look  at  the  differential  of  the  action 
function. 

80 Problem. Show that this is not true for large $t - t_0$. Hint. $\ddot q = -q$ (Figure 199).

Figure 199 Extremal with a focal point which is not contained in any central field

Theorem. The differential of the action function (for a fixed initial point) is equal to

$$dS = p\,dq - H\,dt,$$

where $p = \partial L/\partial \dot q$ and $H = p\dot q - L$ are defined with the help of the terminal velocity $\dot q$ of the trajectory $\gamma$.

Proof. We lift every extremal from $(q, t)$-space to the extended phase space $\{(p, q, t)\}$, setting $p = \partial L/\partial \dot q$, i.e., replacing the extremal by a phase trajectory. We then get an $(n + 1)$-dimensional manifold in the extended phase space consisting of phase trajectories, i.e., characteristic curves of the form $p\,dq - H\,dt$. We now give the endpoint $(q, t)$ an increment $(\Delta q, \Delta t)$, and consider the set of extremals connecting $(q_0, t_0)$ with points of the segment $q + \theta \Delta q$, $t + \theta \Delta t$, $0 \le \theta \le 1$ (Figure 200). In phase space we get a quadrangle $\sigma$ composed of characteristic curves of the form $p\,dq - H\,dt$, the boundary of which consists of two phase trajectories $\gamma_1$ and $\gamma_2$, a segment of a curve $\alpha$ lying in the space $(q = q_0, t = t_0)$, and a segment of a curve $\beta$ projecting to the segment $(\Delta q, \Delta t)$. Since $\sigma$ consists of characteristic curves of the form $p\,dq - H\,dt$, we have

$$0 = \iint_\sigma d(p\,dq - H\,dt) = \int_{\partial\sigma} p\,dq - H\,dt = \left( \int_{\gamma_1} - \int_{\gamma_2} + \int_\beta - \int_\alpha \right) p\,dq - H\,dt.$$

But, on the segment $\alpha$, we have $dq = 0$, $dt = 0$. On the phase trajectories $\gamma_1$ and $\gamma_2$, $p\,dq - H\,dt = L\,dt$ (Section 45C). So, the difference $\int_{\gamma_2} - \int_{\gamma_1} p\,dq - H\,dt$ is equal to the increase of the action function, and we find

$$\int_\beta p\,dq - H\,dt = S(q + \Delta q, t + \Delta t) - S(q, t).$$

If now $\Delta q \to 0$, $\Delta t \to 0$, then

$$\int_\beta p\,dq - H\,dt = p\,\Delta q - H\,\Delta t + o(\Delta t, \Delta q),$$

which proves the theorem. □
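The statement $dS = p\,dq - H\,dt$ is easy to test on the simplest example. The sketch below assumes a free particle with lagrangian $L = m\dot q^2/2$ (an illustrative choice; the function names are hypothetical). Its extremals are straight lines, so the action from $(q_0, t_0)$ to $(q, t)$ is $S = m(q - q_0)^2 / 2(t - t_0)$, and the partial derivatives of $S$ can be compared with the terminal momentum and energy.

```python
def action(q, t, q0=0.0, t0=0.0, m=1.0):
    """Action of a free particle (L = m*v^2/2) along the straight extremal from (q0, t0) to (q, t)."""
    return 0.5 * m * (q - q0) ** 2 / (t - t0)


def differential_of_action(q, t, eps=1e-6):
    """Central-difference estimates of (dS/dq, dS/dt) at the endpoint (q, t)."""
    dS_dq = (action(q + eps, t) - action(q - eps, t)) / (2 * eps)
    dS_dt = (action(q, t + eps) - action(q, t - eps)) / (2 * eps)
    return dS_dq, dS_dt


def terminal_p_and_H(q, t, q0=0.0, t0=0.0, m=1.0):
    """Momentum p = m*v and energy H = p^2/(2m) at the end of the extremal."""
    v = (q - q0) / (t - t0)   # constant velocity along the extremal
    p = m * v
    return p, p * p / (2.0 * m)


if __name__ == "__main__":
    dS_dq, dS_dt = differential_of_action(1.5, 2.0)
    p, H = terminal_p_and_H(1.5, 2.0)
    print(dS_dq, p)      # the theorem predicts dS/dq = p
    print(dS_dt, -H)     # and dS/dt = -H
```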

The form $p\,dq - H\,dt$ was formerly introduced to us artificially. We see now, by carrying out the optical-mechanical analogy, that it arises from examining the action function corresponding to the optical length of a path.

D The Hamilton-Jacobi equation

Recall that the “vector of normal slowness $p$” cannot be altogether arbitrary: it is subject to one condition, $p\dot q = 1$, following from Huygens’ principle. An analogous condition restricts the gradient of the action function $S$.

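Theorem. The action function satisfies the equation

(1)  $\dfrac{\partial S}{\partial t} + H\left( \dfrac{\partial S}{\partial q}, q, t \right) = 0.$

This nonlinear first-order partial differential equation is called the Hamilton-Jacobi equation.

Proof. It is sufficient to notice that, by the previous theorem,

$$\frac{\partial S}{\partial t} = -H(p, q, t), \qquad p = \frac{\partial S}{\partial q}.$$ □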

The  relation  just  established  between  trajectories  of  mechanical  systems 
(“rays”)  and  partial  differential  equations  (“wave  fronts”)  can  be  used  in 
two  directions. 

First,  solutions  of  Equation  (1)  can  be  used  for  integrating  the  ordinary 
differential  equations  of  dynamics.  Jacobi’s  method  of  integrating  Hamilton’s 
canonical  equations,  presented  in  the  next  section,  consists  of  just  this. 

Second,  the  relation  of  the  ray  and  wave  points  of  view  allows  one  to 
reduce  integration  of  the  partial  differential  equations  (1)  to  integration 
of  a  hamiltonian  system  of  ordinary  differential  equations. 

Let us go into this in a little more detail. For the Hamilton-Jacobi equation (1), the Cauchy problem is

(2)  $S(q, t_0) = S_0(q), \qquad \dfrac{\partial S}{\partial t} + H\left( \dfrac{\partial S}{\partial q}, q, t \right) = 0.$

In order to construct a solution to this problem, we look at the hamiltonian system

$$\dot p = -\frac{\partial H}{\partial q}, \qquad \dot q = \frac{\partial H}{\partial p}.$$

We consider the initial conditions (Figure 201):

$$q(t_0) = q_0, \qquad p(t_0) = \left. \frac{\partial S_0}{\partial q} \right|_{q_0}.$$

The solution corresponding to these equations is represented in $(q, t)$-space by the curve $q = q(t)$, which is the extremal of the principle $\delta \int L\,dt = 0$ (where the lagrangian $L(q, \dot q, t)$ is the Legendre transformation with respect to $p$ of the hamiltonian function $H(p, q, t)$). This extremal is called the characteristic of problem (2), emanating from the point $q_0$.

If the value $t_1$ is sufficiently close to $t_0$, then the characteristics emanating from points close to $q_0$ do not intersect for $t_0 \le t \le t_1$, $|q - q_0| < R$. Furthermore, the values of $q_0$ and $t$ can be taken as coordinates for points in the region $|q - q_0^*| < R$, $t_0 \le t \le t_1$ (Figure 201).

Figure 201 Characteristics for a solution of Cauchy’s problem for the Hamilton-Jacobi equation

We now construct the “action function with initial condition $S_0$”:

(3)  $S(A) = S_0(q_0) + \displaystyle\int_{(q_0, t_0)}^{A} L(q, \dot q, t)\,dt$

(integrating along the characteristic leading to $A$).

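Theorem. The function (3) is a solution of problem (2).

Proof. The initial condition is clearly fulfilled. The fact that the Hamilton-Jacobi equation is satisfied is verified just as in the theorem on differentials of action functions (Figure 202).

By Stokes’ lemma, $\left( \int_{\gamma_1} - \int_{\gamma_2} + \int_\beta - \int_\alpha \right) p\,dq - H\,dt = 0$. But on $\alpha$, $H\,dt = 0$ and $p = \partial S_0/\partial q$, so

$$\int_\alpha p\,dq - H\,dt = \int_\alpha p\,dq = \int_\alpha dS_0 = S_0(q_0 + \Delta q) - S_0(q_0).$$

Since $\gamma_1$ and $\gamma_2$ are phase trajectories, $p\,dq - H\,dt = L\,dt$ along them, and therefore

$$\int_\beta p\,dq - H\,dt = \left[ S_0(q_0 + \Delta q) + \int_{\gamma_2} L\,dt \right] - \left[ S_0(q_0) + \int_{\gamma_1} L\,dt \right] = S(A + \Delta A) - S(A).$$

Hence $dS = p\,dq - H\,dt$, and the Hamilton-Jacobi equation follows as before. □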
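The construction by characteristics can be carried out explicitly in a simple case. The sketch below rests on assumptions chosen only for illustration: one degree of freedom, $H = p^2/2$ (so $L = \dot q^2/2$), $t_0 = 0$, and initial data $S_0(q) = q^2/2$; the function names are hypothetical. Each characteristic starts at $q_0$ with constant momentum $p = S_0'(q_0) = q_0$, so $q = q_0(1 + t)$, and formula (3) gives $S$ in closed form; the code then checks the Hamilton-Jacobi equation by central differences.

```python
def characteristic_endpoint(q0, t):
    """Position at time t of the characteristic from q0: p = S0'(q0) = q0 is constant, so q = q0 + q0*t."""
    return q0 + q0 * t


def foot_of_characteristic(q, t):
    """Initial point q0 of the characteristic through (q, t); valid while 1 + t > 0."""
    return q / (1.0 + t)


def S(q, t):
    """Formula (3): S0(q0) plus the integral of L = p^2/2 along the characteristic."""
    q0 = foot_of_characteristic(q, t)
    p = q0                                   # constant momentum on this characteristic
    return 0.5 * q0 * q0 + 0.5 * p * p * t


def hamilton_jacobi_residual(q, t, eps=1e-5):
    """dS/dt + H(dS/dq) with H(p) = p^2/2, estimated by central differences."""
    S_t = (S(q, t + eps) - S(q, t - eps)) / (2 * eps)
    S_q = (S(q + eps, t) - S(q - eps, t)) / (2 * eps)
    return S_t + 0.5 * S_q * S_q


if __name__ == "__main__":
    print(hamilton_jacobi_residual(0.7, 0.3))   # should be close to zero
```

For $t < -1$ the characteristics of this example have already crossed (all of them meet at $q = 0$ when $t = -1$), which is why the construction is restricted to $1 + t > 0$.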
Figure 202 The action function as a solution of the Hamilton-Jacobi equation


Problem. Draw a graph of the multiple-valued “functions” $S(q)$ and $p(q)$ for $t = t_3$ (Figure 201).
Answer. Cf. Figure 203.

Figure 203 A typical singularity of a solution of the Hamilton-Jacobi equation

The point of self-intersection of the graph of $S$ corresponds on the graph of $p$ to the Maxwell line: the shaded areas are equal. The graph of $S(q, t)$ has a singularity called a swallow’s tail at the point $(0, t_2)$.


47  The  Hamilton-Jacobi  method  for  integrating 
Hamilton’s  canonical  equations 

In  this  paragraph  we  define  the  generating  function  of  a  free  canonical  transformation. 


The  idea  of  the  Hamilton-Jacobi  method  consists  of  the  following.  Under 
canonical  changes  of  coordinates,  the  canonical  form  of  the  equations  of 
motion  is  preserved,  as  is  the  hamiltonian  function  (Section  45A).  Therefore, 
if  we  succeed  in  finding  a  canonical  transformation  which  reduces  the 
hamiltonian  function  to  a  form  such  that  the  canonical  equations  can  be 
integrated,  then  we  can  also  integrate  the  original  canonical  equations.  It 
turns  out  that  the  problem  of  constructing  such  a  canonical  transformation 
reduces  to  the  determination  of  a  sufficiently  large  number  of  solutions  to 
the  Hamilton-Jacobi  partial  differential  equation.  The  generating  function 
of  the  desired  canonical  transformation  must  satisfy  this  equation. 

Before turning to the apparatus of generating functions, we remark that it is unfortunately noninvariant and it uses, in an essential way, the coordinate structure in phase space $\{(p, q)\}$. It is necessary to use the apparatus of partial derivatives, in which even the notation is ambiguous.⁸¹

A  Generating  functions 

Suppose that the $2n$ functions $P(p, q)$ and $Q(p, q)$ of the $2n$ variables $p$ and $q$ give a canonical transformation $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$. Then the 1-form $p\,dq - P\,dQ$ is an exact differential (Section 45A):

(1)  $p\,dq - P\,dQ = dS(p, q).$

Problem.  Show  the  converse:  if  this  form  is  an  exact  differential,  then  the  transformation  is 
canonical. 


We now assume that, in a neighborhood of some point $(p_0, q_0)$, we can
take (Q, q) as independent coordinates. In other words, we assume that
the following jacobian is not zero at $(p_0, q_0)$:

$$\det \frac{\partial(Q, q)}{\partial(p, q)} = \det \frac{\partial Q}{\partial p} \neq 0.$$


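81 It is important to note that the quantity $\partial u/\partial x$ on the x, y-plane depends not only on the
function which is taken for x, but also on the choice of the function y: in new variables (x, z)
the value of $\partial u/\partial x$ will be different. One should write

$$\left.\frac{\partial u}{\partial x}\right|_{y = \text{const}}, \qquad \left.\frac{\partial u}{\partial x}\right|_{z = \text{const}}.$$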
Such canonical transformations will be called free. In this case, the function S
can be expressed locally in these coordinates:

$$S(p, q) = S_1(Q, q).$$

Definition. The function $S_1(Q, q)$ is called a generating function of our
canonical transformation g.


We emphasize that $S_1$ is not a function on the phase space $\mathbb{R}^{2n}$: it is a
function on a region in the direct product $\mathbb{R}^n_q \times \mathbb{R}^n_Q$ of two n-dimensional
coordinate spaces, whose points are denoted by q and Q. It follows from (1)
that the “partial derivatives” of $S_1$ are

$$\frac{\partial S_1(Q, q)}{\partial q} = p \quad\text{and}\quad \frac{\partial S_1(Q, q)}{\partial Q} = -P. \tag{2}$$

Conversely, every function $S_1$ gives a canonical transformation g by
formulas (2).


Theorem. Let $S_1(Q, q)$ be a function given on a neighborhood of some point
$(Q_0, q_0)$ of the direct product of two n-dimensional euclidean spaces. If

$$\det \left.\frac{\partial^2 S_1}{\partial Q\,\partial q}\right|_{Q_0, q_0} \neq 0,$$

then $S_1$ is a generating function of some free canonical transformation.
Proof. Consider the equation for the Q coordinates:

$$\frac{\partial S_1(Q, q)}{\partial q} = p.$$

By the implicit function theorem this equation can be solved to determine a
function Q(p, q) in a neighborhood of the point

$$\left(q_0,\; p_0 = \left.\frac{\partial S_1(Q, q)}{\partial q}\right|_{Q_0, q_0}\right)$$

(with $Q(p_0, q_0) = Q_0$). The determinant we need here is

$$\det \left.\frac{\partial^2 S_1(Q, q)}{\partial Q\,\partial q}\right|_{Q_0, q_0},$$

and this is different from zero by hypothesis.
We now consider the function

$$P_1(Q, q) = -\frac{\partial S_1(Q, q)}{\partial Q},$$

and set

$$P(p, q) = P_1(Q(p, q), q).$$




9:  Canonical  formalism 


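Then the local map $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ sending the point (p, q) to the point
(P(p, q), Q(p, q)) will be canonical with generating function $S_1$, since by
construction

$$p\,dq - P\,dQ = \frac{\partial S_1(Q, q)}{\partial q}\,dq + \frac{\partial S_1(Q, q)}{\partial Q}\,dQ.$$

It is free, since $\det(\partial Q/\partial p) = \left(\det(\partial^2 S_1(Q, q)/\partial Q\,\partial q)\right)^{-1} \neq 0$. □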


The transformation $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is given in general by 2n functions of
2n variables. We see that a canonical transformation is given entirely by
one function of 2n variables: its generating function. It is easy to see how
useful generating functions are in all calculations related to canonical
transformations. This becomes even more so as the number of variables, 2n,
becomes large.
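As a numerical illustration of the theorem above, the following sketch takes a hypothetical generating function $S_1(Q, q) = qQ + \epsilon \sin(q)\,Q^2$ with n = 1. The function, the value of ε, and all names in the code are assumptions chosen for illustration; none of them comes from the text. The code solves $p = \partial S_1/\partial q$ for Q by Newton's method, sets $P = -\partial S_1/\partial Q$, and checks by finite differences that the map (p, q) → (P, Q) preserves area, as a canonical map of the plane should.

```python
import math

EPS = 0.1  # assumed strength of the non-linear term (illustrative value)

def dS1_dq(Q, q):
    """dS1/dq for the assumed S1(Q, q) = q*Q + EPS*sin(q)*Q**2."""
    return Q + EPS * math.cos(q) * Q**2

def dS1_dQ(Q, q):
    """dS1/dQ for the same S1."""
    return q + 2.0 * EPS * math.sin(q) * Q

def d2S1_dQdq(Q, q):
    """Mixed derivative; it must be nonzero for the transformation to be free."""
    return 1.0 + 2.0 * EPS * math.cos(q) * Q

def transform(p, q, iterations=50):
    """Map (p, q) to (P, Q) using p = dS1/dq and P = -dS1/dQ."""
    Q = p  # starting guess, exact when EPS == 0
    for _ in range(iterations):
        Q -= (dS1_dq(Q, q) - p) / d2S1_dQdq(Q, q)  # Newton step for Q
    P = -dS1_dQ(Q, q)
    return P, Q

def jacobian_det(p, q, h=1e-5):
    """Determinant of d(P, Q)/d(p, q) by central differences."""
    P_pp, Q_pp = transform(p + h, q)
    P_pm, Q_pm = transform(p - h, q)
    P_qp, Q_qp = transform(p, q + h)
    P_qm, Q_qm = transform(p, q - h)
    dP_dp, dQ_dp = (P_pp - P_pm) / (2 * h), (Q_pp - Q_pm) / (2 * h)
    dP_dq, dQ_dq = (P_qp - P_qm) / (2 * h), (Q_qp - Q_qm) / (2 * h)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

det = jacobian_det(0.3, 0.7)
print(f"det d(P,Q)/d(p,q) = {det:.8f}")  # expected to be close to 1
```

The sample point (0.3, 0.7) is arbitrary; any point where the mixed derivative stays away from zero should behave the same way.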


B The Hamilton-Jacobi equation for generating functions

We notice that canonical equations in which the hamiltonian function
depends only on the variable Q are easy to integrate. If H = K(Q, t), then the
canonical equations have the form

$$\dot Q = 0, \qquad \dot P = -\frac{\partial K}{\partial Q}, \tag{3}$$

from which we have immediately

$$Q(t) = Q(0), \qquad P(t) = P(0) - \int_0^t \left.\frac{\partial K}{\partial Q}\right|_{Q(0)} dt.$$

We will now look for a canonical transformation reducing the hamiltonian
H(p, q) to the form K(Q). To this end we will look for a generating function
of such a transformation, S(Q, q). From (2) we obtain the condition

$$H\!\left(\frac{\partial S(Q, q)}{\partial q},\, q\right) = K(Q), \tag{4}$$

where after differentiation we must substitute q(P, Q) for q. We notice that
for fixed Q, Equation (4) has the form of the Hamilton-Jacobi equation.


Jacobi’s theorem. If a solution S(Q, q) is found to the Hamilton-Jacobi
equation (4), depending on n parameters82 $Q_i$ and such that $\det(\partial^2 S/\partial Q\,\partial q) \neq 0$,
then the canonical equations

$$\dot p = -\frac{\partial H}{\partial q} \quad\text{and}\quad \dot q = \frac{\partial H}{\partial p} \tag{5}$$

can be solved explicitly by quadratures. The functions Q(p, q) determined
by the equations $\partial S(Q, q)/\partial q = p$ are first integrals of the equations (5).


82  An  n-parameter  family  of  solutions  of  (4)  is  called  a  complete  integral  of  the  equation. 
Proof. Consider the canonical transformation with generating function
S(Q, q). By (2) we have $p = (\partial S/\partial q)(Q, q)$, from which we can determine
Q(p, q). We calculate the function H(p, q) in the new coordinates P, Q.
We have $H(p, q) = H((\partial S/\partial q)(Q, q), q)$. In order to find the hamiltonian
function in the new coordinates we must substitute into this expression
(after differentiation) for q its expression in terms of P and Q. However,
by (4), this expression does not depend on q at all, so we have simply

$$H(p, q) = K(Q).$$

Thus, in the new variables, Equation (5) has the form (3), from which Jacobi’s
theorem follows directly. □

Jacobi’s  theorem  reduces  solving  the  system  of  ordinary  differential 
equations  (5)  to  finding  a  complete  integral  of  the  partial  differential  equation 
(4).  It  may  appear  surprising  that  this  “reduction”  from  the  simple  to  the 
complicated  provides  an  effective  method  for  solving  concrete  problems. 
Nevertheless,  it  turns  out  that  this  is  the  most  powerful  method  known  for 
exact  integration,  and  many  problems  which  were  solved  by  Jacobi  cannot  be 
solved  by  other  methods. 
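
As a minimal worked illustration of Jacobi's theorem, take n = 1, $H = (p^2 + q^2)/2$ and K(Q) = Q. The harmonic oscillator is an assumed example chosen for its simplicity; it is not one of the problems treated in the text. On the branch where p > 0:

$$
\begin{aligned}
&\tfrac12\left(\frac{\partial S}{\partial q}\right)^2 + \tfrac12 q^2 = Q
\;\Longrightarrow\; S(Q, q) = \int_0^q \sqrt{2Q - s^2}\,ds, \qquad
\frac{\partial^2 S}{\partial Q\,\partial q} = \frac{1}{\sqrt{2Q - q^2}} \neq 0,\\
&P = -\frac{\partial S}{\partial Q} = -\int_0^q \frac{ds}{\sqrt{2Q - s^2}} = -\arcsin\frac{q}{\sqrt{2Q}},\\
&\dot Q = \frac{\partial K}{\partial P} = 0, \quad \dot P = -\frac{\partial K}{\partial Q} = -1
\;\Longrightarrow\; q(t) = \sqrt{2Q}\,\sin(t - P(0)), \quad p = \frac{\partial S}{\partial q} = \sqrt{2Q}\,\cos(t - P(0)).
\end{aligned}
$$

Here the first integral $Q(p, q) = (p^2 + q^2)/2$ is the energy, and the single quadrature for P already yields the motion.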

C  Examples 

We  consider  the  problem  of  attraction  by  two  fixed  centers.  Interest  in  this 
problem  has  grown  recently  in  connection  with  the  study  of  the  motion  of 
artificial  earth  satellites.  It  is  fairly  clear  that  two  close  centers  of  attraction 
on  the  z-axis  approximate  attraction  by  an  ellipsoid  slightly  extended  along 
the  z-axis.  Unfortunately,  the  earth  is  not  prolate,  but  oblate.  To  overcome 
this difficulty, one must place the centers at imaginary points at distances ±iε
from  the  origin  along  the  z-axis.  Analytic  formulas  for  the  solution  are  true, 
of  course,  in  the  complex  region.  In  this  way  we  obtain  an  approximation 
to  the  earth’s  field  of  gravity,  in  which  the  equations  of  motion  can  be  exactly 
integrated  and  which  is  closer  to  reality  than  the  keplerian  approximation 
in  which  the  earth  is  a  point. 

For simplicity we will consider only the planar problem of attraction by
two fixed points with equal masses. The success of Jacobi’s method is based
on the adoption of a suitable coordinate system, called elliptic coordinates.
Suppose that the distance between the fixed points $O_1$ and $O_2$ is 2c (Figure
204), and that the distances of a moving mass from them are $r_1$ and $r_2$,
respectively. The elliptic coordinates ξ, η are defined as the sum and difference
of the distances to the points $O_1$ and $O_2$: $\xi = r_1 + r_2$, $\eta = r_1 - r_2$.

Figure 204 Elliptical coordinates

Figure 205 Confocal ellipses and hyperbolas


Problem.  Express  the  hamiltonian  function  in  elliptic  coordinates. 

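Solution. The lines ξ = const are ellipses with foci at $O_1$ and $O_2$; the lines η = const are
hyperbolas with the same foci (Figure 205). They are mutually orthogonal; therefore,

$$ds^2 = a^2\,d\xi^2 + b^2\,d\eta^2.$$

We will find the coefficients a and b. For motion along an ellipse we have $dr_1 = ds \cos\alpha$ and
$dr_2 = -ds \cos\alpha$, so $d\eta = 2\cos\alpha\,ds$. For motion along a hyperbola we have $dr_1 = ds \sin\alpha$
and $dr_2 = ds \sin\alpha$, so $d\xi = 2\sin\alpha\,ds$. Thus $a = (2\sin\alpha)^{-1}$ and $b = (2\cos\alpha)^{-1}$. Furthermore,
from the triangle $O_1 M O_2$ we find $r_1^2 + r_2^2 + 2r_1r_2\cos 2\alpha = 4c^2$, which implies

$$\cos^2\alpha - \sin^2\alpha = \frac{4c^2 - r_1^2 - r_2^2}{2r_1r_2}, \qquad \cos^2\alpha + \sin^2\alpha = \frac{2r_1r_2}{2r_1r_2},$$

$$\cos^2\alpha = \frac{4c^2 - (r_1 - r_2)^2}{4r_1r_2}, \qquad \sin^2\alpha = \frac{(r_1 + r_2)^2 - 4c^2}{4r_1r_2}.$$

But if $ds^2 = \sum a_i^2\,dq_i^2$, then

$$T = \sum \frac{a_i^2 \dot q_i^2}{2}, \qquad p_i = a_i^2 \dot q_i, \qquad H = \sum \frac{p_i^2}{2a_i^2} + U.$$

Thus,

$$H = p_\xi^2\,\frac{(r_1 + r_2)^2 - 4c^2}{2r_1r_2} + p_\eta^2\,\frac{4c^2 - (r_1 - r_2)^2}{2r_1r_2} - \frac{k}{r_1} - \frac{k}{r_2}.$$

But $r_1 + r_2 = \xi$, $r_1 - r_2 = \eta$, and $4r_1r_2 = \xi^2 - \eta^2$. Therefore, finally,

$$H = 2p_\xi^2\,\frac{\xi^2 - 4c^2}{\xi^2 - \eta^2} + 2p_\eta^2\,\frac{4c^2 - \eta^2}{\xi^2 - \eta^2} - \frac{4k\xi}{\xi^2 - \eta^2}.$$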
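
The hamiltonian in elliptic coordinates can be checked numerically against the cartesian one. The sketch below assumes unit mass, foci at (−c, 0) and (c, 0), and arbitrary sample values for the position, momentum, c and k; all function names are hypothetical. It converts cartesian momenta to $(p_\xi, p_\eta)$ through the finite-difference Jacobian of (ξ, η) and compares the two expressions for H.

```python
import math

def elliptic(x, y, c):
    """Elliptic coordinates (xi, eta) = (r1 + r2, r1 - r2) for foci at (-c, 0), (c, 0)."""
    r1 = math.hypot(x + c, y)
    r2 = math.hypot(x - c, y)
    return r1 + r2, r1 - r2

def elliptic_momenta(x, y, px, py, c, h=1e-6):
    """Solve p_x = p_xi*dxi/dx + p_eta*deta/dx (and likewise for y) for p_xi, p_eta."""
    xi_x = (elliptic(x + h, y, c)[0] - elliptic(x - h, y, c)[0]) / (2 * h)
    xi_y = (elliptic(x, y + h, c)[0] - elliptic(x, y - h, c)[0]) / (2 * h)
    eta_x = (elliptic(x + h, y, c)[1] - elliptic(x - h, y, c)[1]) / (2 * h)
    eta_y = (elliptic(x, y + h, c)[1] - elliptic(x, y - h, c)[1]) / (2 * h)
    det = xi_x * eta_y - eta_x * xi_y
    p_xi = (px * eta_y - py * eta_x) / det   # Cramer's rule
    p_eta = (py * xi_x - px * xi_y) / det
    return p_xi, p_eta

def h_cartesian(x, y, px, py, c, k):
    """Kinetic energy of a unit mass plus the potential -k/r1 - k/r2."""
    r1 = math.hypot(x + c, y)
    r2 = math.hypot(x - c, y)
    return 0.5 * (px**2 + py**2) - k / r1 - k / r2

def h_elliptic(xi, eta, p_xi, p_eta, c, k):
    """Hamiltonian in elliptic coordinates, as obtained in the Solution."""
    d = xi**2 - eta**2
    return (2 * p_xi**2 * (xi**2 - 4 * c**2)
            + 2 * p_eta**2 * (4 * c**2 - eta**2)
            - 4 * k * xi) / d

# Arbitrary sample state away from the line through the foci.
x, y, px, py, c, k = 0.7, 1.1, 0.3, -0.4, 1.0, 1.0
xi, eta = elliptic(x, y, c)
p_xi, p_eta = elliptic_momenta(x, y, px, py, c)
print(h_cartesian(x, y, px, py, c, k), h_elliptic(xi, eta, p_xi, p_eta, c, k))
```

The two printed values should agree up to the finite-difference error of the Jacobian.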


We  will  now  solve  the  Hamilton-Jacobi  equation. 


Definition. If, in the equation

$$\Phi_1\!\left(\frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n};\; q_1, \ldots, q_n\right) = 0,$$

the variable $q_1$ and derivative $\partial S/\partial q_1$ appear only in the form of a
combination $\varphi(\partial S/\partial q_1, q_1)$, then we say that the variable $q_1$ is separable.


In this case it is useful to look for a solution of the equation of the form

$$S = S_1(q_1) + S'(q_2, \ldots, q_n).$$

By setting $\varphi(\partial S_1/\partial q_1, q_1) = c_1$ in this equation, we obtain an equation for S'
with a smaller number of variables

$$\Phi_2\!\left(\frac{\partial S'}{\partial q_2}, \ldots, \frac{\partial S'}{\partial q_n},\; q_2, \ldots, q_n;\; c_1\right) = 0.$$

Let $S' = S'(q_2, \ldots, q_n; c_1, c)$ be a family of solutions to this equation
depending on the parameters $c_i$. The functions $S_1(q_1, c_1) + S'$ will satisfy
the desired equation if $S_1$ satisfies the ordinary differential equation
$\varphi(\partial S_1/\partial q_1, q_1) = c_1$. This equation is easy to solve; we express $\partial S_1/\partial q_1$
in terms of $q_1$ and $c_1$ to obtain $\partial S_1/\partial q_1 = \psi(q_1, c_1)$, from which

$$S_1 = \int^{q_1} \psi(q_1, c_1)\,dq_1.$$

If one of the variables, say $q_2$, is separable in the new equation (with $\Phi_2$)
we can repeat this procedure and (in the most favorable case) we can find
a solution of the original equation depending on n constants

$$S_1(q_1; c_1) + S_2(q_2; c_1, c_2) + \cdots + S_n(q_n; c_1, \ldots, c_n).$$

In this case we say that the variables are completely separable.

If the variables are completely separable, then a solution depending on n
parameters of the Hamilton-Jacobi equation, $\Phi_1(\partial S/\partial q, q) = 0$, is found by
quadratures. But then the corresponding system of canonical equations can
also be integrated by quadratures (Jacobi’s theorem).
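
For instance, take n = 2 and $H = \tfrac12(p_1^2 + p_2^2) + V_1(q_1) + V_2(q_2)$. This is an assumed example with hypothetical potentials $V_1$ and $V_2$, added only to illustrate the definition. The equation H = h then separates completely:

$$
\begin{aligned}
&\Phi_1 = \varphi\!\left(\frac{\partial S}{\partial q_1}, q_1\right) + \tfrac12\left(\frac{\partial S}{\partial q_2}\right)^2 + V_2(q_2) - h = 0,
\qquad \varphi = \tfrac12\left(\frac{\partial S}{\partial q_1}\right)^2 + V_1(q_1),\\
&\varphi = c_1 \;\Longrightarrow\; \Phi_2 = \tfrac12\left(\frac{\partial S'}{\partial q_2}\right)^2 + V_2(q_2) - (h - c_1) = 0,\\
&S = \int^{q_1} \sqrt{2\,(c_1 - V_1(q_1))}\,dq_1 + \int^{q_2} \sqrt{2\,(h - c_1 - V_2(q_2))}\,dq_2 .
\end{aligned}
$$

The solution depends on the two constants $c_1$ and h, and the matrix of mixed derivatives with respect to $(c_1, h)$ and $(q_1, q_2)$ has determinant $1/(p_1 p_2)$, so this is a complete integral wherever both radicands are positive.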

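We apply the above to the problem of two fixed centers. With the hamiltonian
found above, the Hamilton-Jacobi equation (4) has the form

$$2\left(\frac{\partial S}{\partial \xi}\right)^2 (\xi^2 - 4c^2) + 2\left(\frac{\partial S}{\partial \eta}\right)^2 (4c^2 - \eta^2) = K(\xi^2 - \eta^2) + 4k\xi.$$

We can separate variables by, for instance, setting

$$2\left(\frac{\partial S}{\partial \xi}\right)^2 (\xi^2 - 4c^2) - 4k\xi - K\xi^2 = c_1$$

and

$$2\left(\frac{\partial S}{\partial \eta}\right)^2 (4c^2 - \eta^2) + K\eta^2 = -c_1.$$

Then we find the complete integral of Equation (4) in the form

$$S(\xi, \eta; c_1, c_2) = \int \sqrt{\frac{c_1 + c_2 \xi^2 + 4k\xi}{2(\xi^2 - 4c^2)}}\,d\xi + \int \sqrt{\frac{-c_1 - c_2 \eta^2}{2(4c^2 - \eta^2)}}\,d\eta,$$

where $c_2 = K$.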
Jacobi’s theorem now gives an explicit expression, in terms of elliptic
integrals,  for  motion  in  the  problem  of  two  fixed  centers.  A  more  detailed 
investigation  of  this  motion  can  be  found  in  Charlier’s  book  “Die  Mechanik 
des  Himmels,”  Berlin,  Leipzig,  W.  de  Gruyter  &  Co.,  1927. 
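
The separation can be sanity-checked numerically. The sketch below assumes unit mass and the hamiltonian $H = [2p_\xi^2(\xi^2 - 4c^2) + 2p_\eta^2(4c^2 - \eta^2) - 4k\xi]/(\xi^2 - \eta^2)$, takes the separated momenta from $2p_\xi^2(\xi^2 - 4c^2) = c_1 + c_2\xi^2 + 4k\xi$ and $2p_\eta^2(4c^2 - \eta^2) = -c_1 - c_2\eta^2$, and verifies that H evaluates to the constant $c_2$ at several points. The numerical values of c, k, $c_1$, $c_2$ and the sample points are arbitrary assumptions, chosen so that both radicands are positive.

```python
import math

def h_elliptic(xi, eta, p_xi, p_eta, c, k):
    """Two-center hamiltonian in elliptic coordinates (unit mass assumed)."""
    d = xi**2 - eta**2
    return (2 * p_xi**2 * (xi**2 - 4 * c**2)
            + 2 * p_eta**2 * (4 * c**2 - eta**2)
            - 4 * k * xi) / d

def separated_momenta(xi, eta, c, k, c1, c2):
    """p_xi = dS/dxi and p_eta = dS/deta on the positive branches of the square roots."""
    p_xi = math.sqrt((c1 + c2 * xi**2 + 4 * k * xi) / (2 * (xi**2 - 4 * c**2)))
    p_eta = math.sqrt((-c1 - c2 * eta**2) / (2 * (4 * c**2 - eta**2)))
    return p_xi, p_eta

C, K_ATTR = 1.0, 1.0      # half-distance between centers and attraction constant (assumed)
C1, C2 = -0.5, -0.1       # separation constants (assumed); C2 plays the role of the energy

energies = []
for xi, eta in [(2.5, -1.0), (3.0, 0.4), (4.0, 1.5)]:   # xi > 2c and |eta| < 2c
    p_xi, p_eta = separated_momenta(xi, eta, C, K_ATTR, C1, C2)
    energies.append(h_elliptic(xi, eta, p_xi, p_eta, C, K_ATTR))

print(energies)  # every entry should reproduce C2 up to rounding
```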

Another  application  of  the  problem  of  the  attraction  of  two  fixed  centers  is 
the  study  of  motion  with  fixed  pull  in  a  field  with  one  attracting  center. 

This  is  a  question  of  the  motion  of  a  point  mass  under  the  action  of  a 
newtonian attraction of a fixed center and one more force (“pull”) of constant
magnitude and direction. This problem can be looked at as the limiting
case  of  the  problem  of  attraction  by  two  fixed  centers.  In  the  passage  to 
the  limit,  one  center  goes  off  to  infinity  in  the  direction  of  the  thrust  force 
(during  which  its  mass  must  grow  proportionally  to  the  square  of  the  distance 
moved  in  order  to  guarantee  constant  pull). 

This  limiting  case  of  the  problem  of  the  attraction  of  two  fixed  centers 
can  be  integrated  explicitly  (in  elliptic  functions).  We  can  convince  ourselves 
of  this  by  passing  to  a  limit  or  by  directly  separating  variables  in  the  problem 
of  motion  with  constant  pull  in  a  field  with  one  center.  The  coordinates 
in  which  the  variables  are  separated  in  this  problem  are  obtained  as  the 
limit  of  elliptic  coordinates  as  one  of  the  centers  approaches  infinity.  They 
are  called  parabolic  coordinates  and  are  given  by  the  formulas 

$$u = r - x, \qquad v = r + x$$

(the  pull  is  directed  along  the  x-axis). 

A  description  of  the  trajectories  of  a  motion  with  constant  pull  (many 
of which are very intricate) can be found in V. V. Beletzkii’s book “Sketches
of  motions  of  celestial  bodies,”  Nauka,  1972. 

As one more example we consider the problem of geodesics on a triaxial
ellipsoid.83 Here Jacobi’s elliptical coordinates $\lambda_1$, $\lambda_2$, and $\lambda_3$ are helpful,
where the $\lambda_i$ are the roots of the equation

$$\frac{x_1^2}{a_1 + \lambda} + \frac{x_2^2}{a_2 + \lambda} + \frac{x_3^2}{a_3 + \lambda} = 1,$$

and $x_1$, $x_2$, and $x_3$ are cartesian coordinates. We will not carry out the
computations showing that the variables are separable (they can be found, for example,
in Jacobi’s “Lectures on dynamics”), but will mention only the result: we
will describe the behavior of the geodesics.
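
The roots $\lambda_i$ are easy to compute numerically. The sketch below assumes $a_1 > a_2 > a_3 > 0$ and a point with all three cartesian coordinates nonzero; under these assumptions the left-hand side minus 1 decreases monotonically on each of the intervals $(-a_3, \infty)$, $(-a_2, -a_3)$, $(-a_1, -a_2)$, so each interval contains exactly one root, found here by bisection. The function names and the sample values are hypothetical.

```python
import math

def confocal_residual(lam, x, a):
    """Left side minus right side of sum_i x_i^2 / (a_i + lam) = 1."""
    return sum(xi**2 / (ai + lam) for xi, ai in zip(x, a)) - 1.0

def bisect_decreasing(f, lo, hi, steps=200):
    """Root of a function that is positive at lo, negative at hi, and decreasing between."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def jacobi_coordinates(x, a, margin=1e-12):
    """Roots lam1 > lam2 > lam3, assuming a1 > a2 > a3 > 0 and all x_i nonzero."""
    a1, a2, a3 = a
    f = lambda lam: confocal_residual(lam, x, a)
    upper = sum(xi**2 for xi in x) - a3 + 1.0   # the residual is negative from here on
    lam1 = bisect_decreasing(f, -a3 + margin, upper)
    lam2 = bisect_decreasing(f, -a2 + margin, -a3 - margin)
    lam3 = bisect_decreasing(f, -a1 + margin, -a2 - margin)
    return lam1, lam2, lam3

a = (3.0, 2.0, 1.0)
x = (1.0, 1.0, math.sqrt(1.0 / 6.0))   # lies on the ellipsoid sum x_i^2 / a_i = 1
lam1, lam2, lam3 = jacobi_coordinates(x, a)
print(lam1, lam2, lam3)
```

Since the sample point lies on the ellipsoid with λ = 0, the largest root should come out as zero up to rounding, while the other two label the hyperboloids of the family passing through the same point.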

The surfaces $\lambda_1$ = const, $\lambda_2$ = const, and $\lambda_3$ = const are surfaces of
second  degree,  called  confocal  quadrics.  The  first  of  these  is  an  ellipsoid,  the 
second  a  hyperboloid  of  one  sheet,  and  the  third  a  hyperboloid  of  two  sheets. 
The  ellipsoid  can  degenerate  into  the  interior  of  an  ellipse,  the  one-sheeted 
hyperboloid  either  into  the  exterior  of  an  ellipse  or  into  the  part  of  a  plane 

83  The  problem  of  geodesics  on  an  ellipsoid  and  the  closely  related  problem  of  ellipsoidal 
billiards  have  found  application  in  a  series  of  recent  results  in  physics  connected  with  laser 
devices. 
between the branches of a hyperbola, and the two-sheeted hyperboloid
either  into  the  part  of  a  plane  outside  the  branches  of  a  hyperbola  or  into  a 
plane. 

Suppose that the ellipsoid under consideration is one of the ellipsoids
in the family with semiaxes a > b > c. Each of the three ellipses $x_1 = 0$,
$x_2 = 0$, and $x_3 = 0$ is a closed geodesic. A geodesic starting from a point
of the largest ellipse (with semiaxes a and b) in a direction close to the
direction of the ellipse (Figure 206), is alternately tangent to the two closed
lines of intersection of the ellipsoid with the one-sheeted hyperboloid of our
family λ = const.84 This geodesic is either closed or is dense in the area


Figure  206  Geodesic  on  a  triaxial  ellipsoid 


Figure  207  Geodesics  emanating  from  an  umbilical  point 

between  the  two  lines  of  intersection.  As  the  slope  of  the  geodesic  increases, 
the  hyperboloids  collapse  down  to  the  region  “inside”  the  hyperbola  which 
intersects  our  ellipsoid  in  its  four  “umbilical  points.”  In  the  limiting  case 
we  obtain  geodesics  passing  through  the  umbilical  points  (Figure  207). 

It  is  interesting  to  note  that  all  the  geodesics  starting  at  an  umbilical 
point  again  converge  at  the  opposite  umbilical  point,  and  all  have  the  same 
length  between  the  two  umbilical  points.  Only  one  of  these  geodesics  is  closed, 
namely,  the  middle  ellipse  with  semiaxes  a  and  c.  If  we  travel  along  any 
other  geodesic  passing  through  an  umbilical  point  in  any  direction,  we  will 
approach  this  ellipse  asymptotically. 

Finally,  geodesics  which  intersect  the  largest  ellipse  even  more  “steeply” 
(Figure  208)  are  alternately  tangent  to  the  two  lines  of  intersection  of  our 

84  These  lines  of  intersection  of  the  confocal  surfaces  are  also  lines  of  curvature  of  the  ellipsoid. 




Figure  208  Geodesics  of  an  ellipsoid  which  are  tangent  to  a  two-sheeted  hyperboloid 

ellipsoid  with  a  two-sheeted  hyperboloid.85  In  general,  they  are  dense  in  the 
region  between  these  lines.  The  small  ellipse  with  semiaxes  b  and  c  is  among 
these  geodesics. 

“The  main  difficulty  in  integrating  a  given  differential  equation  lies  in 
introducing convenient variables, which there is no rule for finding. Therefore,
we must travel the reverse path and after finding some noticeable
substitution,  look  for  problems  to  which  it  can  be  successfully  applied.” 
(Jacobi,  “Lectures  on  dynamics”). 

A  list  of  problems  admitting  separation  of  variables  in  spherical,  elliptical, 
and  parabolic  coordinates  is  given  in  Section  48  of  Landau  and  Lifshitz’s 
“Mechanics”  (Oxford,  Pergamon,  1960). 


48  Generating  functions 

In  this  paragraph  we  construct  the  apparatus  of  generating  functions  for  non-free  canonical 
transformations. 

A The generating function $S_2(P, q)$

Let $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation with g(p, q) = (P, Q). By
the definition of canonical transformation the differential form on $\mathbb{R}^{2n}$

$$p\,dq - P\,dQ = dS$$

is the total differential of some function S(p, q). A canonical transformation is
free if we can take q, Q as 2n independent coordinates. In this case the
function S expressed in the coordinates q and Q is called a generating function
$S_1(q, Q)$. Knowing this function alone, we can find all 2n functions giving the
transformation from the relations

$$p = \frac{\partial S_1(q, Q)}{\partial q} \quad\text{and}\quad P = -\frac{\partial S_1(q, Q)}{\partial Q}. \tag{1}$$

It is far from the case that all canonical transformations are free. For
example, in the case of the identity transformation q and Q = q are dependent.
Therefore, the identity transformation cannot be given by a generating


85  These  are  also  lines  of  curvature. 
function $S_1(q, Q)$. We can, however, obtain generating functions of another
form by means of the Legendre transformation. Suppose, for instance,
that we can take P, q as independent local coordinates on $\mathbb{R}^{2n}$ (i.e., the
determinant $\det(\partial(P, q)/\partial(p, q)) = \det(\partial P/\partial p)$ is not zero). Then we have

$$p\,dq - P\,dQ = dS \quad\text{and}\quad p\,dq + Q\,dP = d(PQ + S).$$

The quantity PQ + S, expressed in terms of (P, q), is also called a generating
function

$$S_2(P, q) = PQ + S(p, q).$$

For this function, we find

$$p = \frac{\partial S_2(P, q)}{\partial q} \quad\text{and}\quad Q = \frac{\partial S_2(P, q)}{\partial P}. \tag{2}$$

Conversely, if $S_2(P, q)$ is any function for which the determinant

$$\det \left.\frac{\partial^2 S_2(P, q)}{\partial q\,\partial P}\right|_{P_0, q_0}$$

is not zero, then in a neighborhood of the point

$$\left(p_0 = \left.\frac{\partial S_2(P, q)}{\partial q}\right|_{P_0, q_0},\; q_0\right)$$

we can solve the first group of equations (2) for P and obtain a function
P(p, q) (where $P(p_0, q_0) = P_0$). After this, the second group of equations (2)
determines Q(p, q), and the map (p, q) → (P, Q) is canonical (prove this!).

Problem. Find a generating function $S_2$ for the identity map P = p, Q = q.

Answer. Pq.


Remark. The generating function $S_2(P, q)$ is convenient also because there are no minus
signs in the formulas (2), and they are easy to remember if we remember that the generating
function of the identity transformation is Pq.
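
As a further illustration of formulas (2), take n = 1 and $S_2(P, q) = P f(q)$, where f is a hypothetical smooth function with $f'(q_0) \neq 0$. This is an assumed example, not one from the text:

$$p = \frac{\partial S_2}{\partial q} = P f'(q), \qquad Q = \frac{\partial S_2}{\partial P} = f(q), \qquad \frac{\partial^2 S_2}{\partial q\,\partial P} = f'(q) \neq 0,$$

so that $Q = f(q)$, $P = p/f'(q)$, and

$$dP \wedge dQ = \left(\frac{dp}{f'(q)} - \frac{p\,f''(q)}{f'(q)^2}\,dq\right) \wedge f'(q)\,dq = dp \wedge dq.$$

For f(q) = q this reduces to the identity map with $S_2 = Pq$.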

B $2^n$ generating functions

Unfortunately, the variables P, q cannot always be chosen for local
coordinates either; however, we can always choose some set of n new
coordinates

$$P_i = (P_{i_1}, \ldots, P_{i_k}), \qquad Q_j = (Q_{j_1}, \ldots, Q_{j_{n-k}})$$

so that together with the old q we obtain 2n independent coordinates.

Here $(i_1, \ldots, i_k)$, $(j_1, \ldots, j_{n-k})$ is any partition of the set (1, ..., n) into
two non-intersecting parts; so there are in all $2^n$ cases.
Theorem. Let $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation given by the
functions P(p, q) and Q(p, q). In a neighborhood of every point $(p_0, q_0)$ at
least one of the $2^n$ sets of functions $(P_i, Q_j, q)$ can be taken as independent
coordinates on $\mathbb{R}^{2n}$:

$$\det \frac{\partial(P_i, Q_j, q)}{\partial(p_i, p_j, q)} = \det \frac{\partial(P_i, Q_j)}{\partial(p_i, p_j)} \neq 0.$$

In a neighborhood of such a point, the canonical transformation g can be
reconstructed from the function

$$S_3(P_i, Q_j, q) = (P_i Q_i) + \int p\,dq - P\,dQ$$

by the relations

$$p = \frac{\partial S_3}{\partial q}, \qquad Q_i = \frac{\partial S_3}{\partial P_i}, \quad\text{and}\quad P_j = -\frac{\partial S_3}{\partial Q_j}. \tag{3}$$

Conversely, if $S_3(P_i, Q_j, q)$ is any function for which the determinant
$\det(\partial^2 S_3/\partial \mathbf{P}\,\partial q)|_{\mathbf{P}_0, q_0}$ (where $\mathbf{P} = (P_i, Q_j)$) is not zero, then the relations (3) give a
canonical transformation in a neighborhood of the point $p_0$, $q_0$.


Proof. The proof of this theorem is almost the same as the one carried out
above in the particular case k = n. We need only verify that the determinant
$\det(\partial(P_i, Q_j)/\partial(p_i, p_j))$ is not zero for one of the $2^n$ sets $(P_i, Q_j, q)$.


We consider the differential of our transformation g at the point $(p_0, q_0)$. By identifying the
tangent space to $\mathbb{R}^{2n}$ with $\mathbb{R}^{2n}$, we can consider dg as a symplectic transformation $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$.

Consider the coordinate p-plane P in $\mathbb{R}^{2n}$ (Figure 209). This is a null n-plane, and its image SP
is also a null plane. We project the plane SP onto the coordinate plane $\sigma = \{(p_i, q_j)\}$ parallel to
the remaining coordinate axes, i.e., in the direction of the n-dimensional null coordinate plane
$\bar\sigma = \{(p_j, q_i)\}$. We denote the projection operator by TS: P → σ.

The condition $\det(\partial(P_i, Q_j)/\partial(p_i, p_j)) \neq 0$ means that TS: P → σ is nonsingular. The operator
S is nonsingular. Therefore, TS is nonsingular if and only if T: SP → σ is nonsingular. In other
words, the null plane SP must be transverse to the null coordinate plane $\bar\sigma$. But we showed in


Figure  209  Checking  non-degeneracy 




Section 41 that at least one of the $2^n$ null coordinate planes is transverse to SP. This means that
one of our $2^n$ determinants is nonzero, as was to be shown. □

Problem. Show that this system of $2^n$ types of generating functions is minimal: given any one of
the $2^n$ determinants, there exists a canonical transformation for which only this determinant is
nonzero.86


C  Infinitesimal  canonical  transformations 

We now consider a canonical transformation which is close to the identity.
Its generating function can be taken close to the generating function Pq
of the identity. We look at a family of canonical transformations $g_\varepsilon$ depending
differentiably on the parameter ε, such that the generating functions have
the form

$$Pq + \varepsilon S(P, q; \varepsilon), \qquad p = P + \varepsilon \frac{\partial S}{\partial q}, \qquad Q = q + \varepsilon \frac{\partial S}{\partial P}. \tag{4}$$

An infinitesimal canonical transformation is an equivalence class of families
$g_\varepsilon$, two families $g_\varepsilon$ and $h_\varepsilon$ being equivalent if their difference is small of higher
than first order, $|g_\varepsilon - h_\varepsilon| = O(\varepsilon^2)$, ε → 0.


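Theorem. An infinitesimal canonical transformation satisfies Hamilton's
differential equations

$$\left.\frac{dP}{d\varepsilon}\right|_{\varepsilon = 0} = -\frac{\partial H}{\partial q}, \qquad \left.\frac{dQ}{d\varepsilon}\right|_{\varepsilon = 0} = \frac{\partial H}{\partial p}$$

with Hamiltonian function H(p, q) = S(p, q, 0).

Proof. The result follows from formula (4): P → p as ε → 0. □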


Corollary. A one-parameter group of transformations of phase space $\mathbb{R}^{2n}$
satisfies Hamilton’s canonical equations if and only if the transformations
are canonical.
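
Formula (4) can also be read as a recipe for a numerical step. A minimal sketch, assuming n = 1 and the hypothetical choice $S(P, q) = P^2/2 - \cos q$ (a pendulum hamiltonian, not taken from the text): solving $p = P + \varepsilon\,\partial S/\partial q$ for P and then evaluating $Q = q + \varepsilon\,\partial S/\partial P$ gives a map that is canonical for every ε and agrees with Hamilton's equations to first order in ε. In the numerical literature this construction is known as the symplectic Euler method.

```python
import math

def step(p, q, eps):
    """Map (p, q) -> (P, Q) generated by P*q + eps*S(P, q) with S = P**2/2 - cos(q)."""
    P = p - eps * math.sin(q)   # from p = P + eps * dS/dq, where dS/dq = sin(q)
    Q = q + eps * P             # from Q = q + eps * dS/dP, where dS/dP = P
    return P, Q

def jacobian_det(p, q, eps, h=1e-5):
    """Determinant of d(P, Q)/d(p, q) by central differences."""
    P_pp, Q_pp = step(p + h, q, eps)
    P_pm, Q_pm = step(p - h, q, eps)
    P_qp, Q_qp = step(p, q + h, eps)
    P_qm, Q_qm = step(p, q - h, eps)
    dP_dp, dQ_dp = (P_pp - P_pm) / (2 * h), (Q_pp - Q_pm) / (2 * h)
    dP_dq, dQ_dq = (P_qp - P_qm) / (2 * h), (Q_qp - Q_qm) / (2 * h)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

p0, q0 = 0.4, 1.2                      # arbitrary sample point

# Area preservation holds for a finite step, not only in the limit eps -> 0.
print(jacobian_det(p0, q0, eps=0.3))

# First-order agreement with dP/d(eps) = -dH/dq and dQ/d(eps) = dH/dp at eps = 0.
small = 1e-6
P, Q = step(p0, q0, small)
print((P - p0) / small, -math.sin(q0))
print((Q - q0) / small, p0)
```

The difference quotients in the last two lines should match the right-hand sides of Hamilton's equations for $H = p^2/2 - \cos q$ up to terms of order ε.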




Figure  210  Geometric  meaning  of  Hamilton’s  function 


86 The number of kinds of generating functions in different textbooks ranges from 4 to $4^n$.
The hamiltonian function H is called the “generating function of the
infinitesimal canonical transformation.” We notice that unlike the generating
function S, the function H is a function of points of phase space, invariantly
associated to the transformation.

The function H has a simple geometric meaning. Let x and y be two points
in $\mathbb{R}^{2n}$ (Figure 210), γ a curve connecting them, and ∂γ = y − x. Consider
the images of the curve γ under the transformations $g_\tau$, 0 ≤ τ ≤ ε; they
form a band σ(ε). Now consider the integral of the form $\omega^2 = \sum dp_i \wedge dq_i$
over the 2-chain σ, using the fact that $\partial\sigma = g_\varepsilon \gamma - \gamma + g_\tau x - g_\tau y$.

Problem. Show that

$$\lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \iint_{\sigma(\varepsilon)} \omega^2 = H(x) - H(y)$$

exists and does not depend on the representative of the class $g_\varepsilon$.


From  this  result  we  once  more  obtain  the  well-known 

Corollary.  Under  canonical  transformations  the  canonical  equations  retain 
their  form,  with  the  same  hamiltonian  function. 

Proof. We computed the variation of the hamiltonian function using only
an infinitesimal canonical transformation and the symplectic structure of
$\mathbb{R}^{2n}$, namely the form $\omega^2$. □
Perturbation theory consists of a very useful collection of methods for finding
approximate solutions of “perturbed” problems which are close to completely
solvable “non-perturbed” problems. These methods can be easily
justified  if  we  are  investigating  motion  over  a  small  interval  of  time.  Relatively 
little  is  known  about  how  far  we  can  trust  the  conclusions  of  perturbation 
theory  in  investigating  motion  over  large  or  infinite  intervals  of  time. 

We will see that the motion in many “non-perturbed” integrable problems
turns  out  to  be  conditionally  periodic.  In  the  study  of  unperturbed  problems, 
and  even  more  so  in  the  study  of  the  perturbed  problems,  special  symplectic 
coordinates,  called  “action-angle”  variables,  are  useful.  In  conclusion,  we 
will  prove  a  theorem  justifying  perturbation  theory  for  single-frequency 
systems  and  will  prove  the  adiabatic  invariance  of  action  variables  in  such 
systems. 

49  Integrable  systems 

In order to integrate a system of 2n ordinary differential equations, we must know 2n first
integrals. It turns out that if we are given a canonical system of differential equations, it is often
sufficient to know only n first integrals: each of them allows us to reduce the order of the system
not just by one, but by two.

A  Liouville's  theorem  on  integrable  systems 

Recall  that  a  function  F  is  a  first  integral  of  a  system  with  hamiltonian 
function  H  if  and  only  if  the  Poisson  bracket 

(H,  F)  =  0 

is  identically  equal  to  zero. 
Definition. Two functions $F_1$ and $F_2$ on a symplectic manifold are in involution
if their Poisson bracket is equal to zero.

Liouville proved that if, in a system with n degrees of freedom (i.e., with
a 2n-dimensional phase space), n independent first integrals in involution
are known, then the system is integrable by quadratures.
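
Involution is easy to test numerically. The sketch below assumes the standard coordinate expression of the Poisson bracket, $\sum_i (\partial F/\partial p_i\,\partial G/\partial q_i - \partial F/\partial q_i\,\partial G/\partial p_i)$, up to an overall sign convention (the sign is irrelevant when testing for zero). The planar central-field hamiltonian and the angular momentum used here are a hypothetical example with two degrees of freedom; the function names are not from the text.

```python
import math

def hamiltonian(z):
    """H = |p|^2/2 - 1/|q| for a phase point z = (q1, q2, p1, p2)."""
    q1, q2, p1, p2 = z
    return 0.5 * (p1**2 + p2**2) - 1.0 / math.hypot(q1, q2)

def angular_momentum(z):
    """M = q1*p2 - q2*p1."""
    q1, q2, p1, p2 = z
    return q1 * p2 - q2 * p1

def partial(f, z, i, h=1e-6):
    """Central-difference derivative of f with respect to the i-th coordinate."""
    zp, zm = list(z), list(z)
    zp[i] += h
    zm[i] -= h
    return (f(zp) - f(zm)) / (2 * h)

def poisson_bracket(f, g, z):
    """Sum over i of df/dp_i * dg/dq_i - df/dq_i * dg/dp_i, with z = (q, p)."""
    n = len(z) // 2
    total = 0.0
    for i in range(n):
        total += partial(f, z, n + i) * partial(g, z, i)
        total -= partial(f, z, i) * partial(g, z, n + i)
    return total

z0 = (1.0, 0.5, -0.2, 0.8)   # arbitrary phase point away from the origin
print(poisson_bracket(hamiltonian, angular_momentum, z0))      # close to 0: in involution
print(poisson_bracket(lambda z: z[2], lambda z: z[0], z0))     # bracket of p1 with q1: close to 1
```

With n = 2 degrees of freedom, H and M are two independent first integrals in involution, which is the situation of Liouville's statement for this example.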

Here is the exact formulation of this theorem: Suppose that we are given n
functions in involution on a symplectic 2n-dimensional manifold

$$(F_i, F_j) = 0, \qquad i, j = 1, 2, \ldots, n.$$

Consider a level set of the functions $F_i$

$$M_f = \{x : F_i(x) = f_i,\; i = 1, \ldots, n\}.$$

Assume that the n functions $F_i$ are independent on $M_f$ (i.e., the n 1-forms
$dF_i$ are linearly independent at each point of $M_f$). Then

1. $M_f$ is a smooth manifold, invariant under the phase flow with hamiltonian
function $H = F_1$.

2. If the manifold $M_f$ is compact and connected, then it is diffeomorphic
to the n-dimensional torus

$$T^n = \{(\varphi_1, \ldots, \varphi_n) \bmod 2\pi\}.$$

3. The phase flow with hamiltonian function H determines a conditionally
periodic motion on $M_f$, i.e., in angular coordinates $\varphi = (\varphi_1, \ldots, \varphi_n)$
we have

$$\frac{d\varphi}{dt} = \omega, \qquad \omega = \omega(f).$$

4. The canonical equations with hamiltonian function H can be integrated
by quadratures.

Before  proving  this  theorem,  we  note  a  few  of  its  corollaries. 

Corollary 1. If in a canonical system with two degrees of freedom, a first
integral F is known which does not depend on the hamiltonian H, then the
system is integrable by quadratures; a compact connected two-dimensional
submanifold of the phase space H = h, F = f is an invariant torus, and
motion on it is conditionally periodic.

Proof. F and H are in involution since F is a first integral of a system with
hamiltonian function H. □

As an example with three degrees of freedom, we consider a heavy symmetric
Lagrange top fixed at a point on its axis. Three first integrals are
immediately obvious: H, $M_z$, and $M_3$. It is easy to verify that the integrals
$M_z$ and $M_3$ are in involution. Furthermore, the manifold H = h in the phase
space is compact. Therefore, we can immediately say, without any calculations,
that for the majority of initial conditions87 the motion of the top is
conditionally periodic: the phase trajectories fill up the three-dimensional
torus $H = c_1$, $M_z = c_2$, $M_3 = c_3$. The corresponding three frequencies are
called frequencies of fundamental rotation, precession, and nutation.

Other examples arise from the following observation: if a canonical
system can be integrated by the method of Hamilton-Jacobi, then it has n
first integrals in involution. The method consists of a canonical transformation
(p, q) → (P, Q) such that the $Q_i$ are first integrals. But the functions $Q_i$
and $Q_j$ are clearly in involution.

In  particular,  the  observation  above  applies  to  the  problem  of  attraction 
by  two  fixed  centers.  Other  examples  are  easily  found.  In  fact,  the  theorem 
of  Liouville  formulated  above  covers  all  the  problems  of  dynamics  which 
have  been  integrated  to  the  present  day. 

B Beginning of the proof of Liouville's theorem

We turn now to the proof of the theorem. Consider the level set of the
integrals:

$$M_f = \{x : F_i = f_i,\; i = 1, \ldots, n\}.$$

By hypothesis, the n 1-forms $dF_i$ are linearly independent at each point of
$M_f$; therefore, by the implicit function theorem, $M_f$ is an n-dimensional
submanifold of the 2n-dimensional phase space.

Lemma 1. On the n-dimensional manifold $M_f$ there exist n tangent vector
fields which commute with one another and which are linearly independent
at every point.

Proof. The symplectic structure of phase space defines an operator $I$ taking
1-forms to vector fields. This operator $I$ carries the 1-form $dF_i$ to the field
$I\,dF_i$ of phase velocities of the system with hamiltonian function $F_i$. We
will show that the n fields $I\,dF_i$ are tangent to $M_f$, commute, and are
independent.

The independence of the $I\,dF_i$ at every point of $M_f$ follows from the
independence of the $dF_i$ and the nonsingularity of the isomorphism $I$. The
fields $I\,dF_i$ commute with one another, since the Poisson brackets of their
hamiltonian functions $(F_i, F_j)$ are identically 0. For the same reason, the
derivative of the function $F_i$ in the direction of the field $I\,dF_j$ is equal to zero
for any $i, j = 1, \dots, n$. Thus the fields $I\,dF_i$ are tangent to $M_f$, and Lemma 1
is proved. □
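The involution condition used in this proof can be checked numerically on a familiar example. The following sketch is not taken from the text: it assumes a planar Kepler hamiltonian $H = \tfrac12(p_1^2 + p_2^2) - 1/r$ together with the angular momentum $M = q_1 p_2 - q_2 p_1$, approximates the Poisson bracket by central differences, and fixes one particular sign convention for the bracket (the sign is irrelevant when testing whether a bracket vanishes). All function names are illustrative.

```python
import math

def poisson_bracket(F, G, p, q, eps=1e-5):
    """Central-difference estimate of sum_i (dF/dq_i * dG/dp_i - dF/dp_i * dG/dq_i).

    The overall sign is a convention chosen for this sketch.
    """
    def d_dp(func, i):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        return (func(hi, q) - func(lo, q)) / (2 * eps)

    def d_dq(func, i):
        hi, lo = list(q), list(q)
        hi[i] += eps
        lo[i] -= eps
        return (func(p, hi) - func(p, lo)) / (2 * eps)

    return sum(d_dq(F, i) * d_dp(G, i) - d_dp(F, i) * d_dq(G, i)
               for i in range(len(p)))

def H(p, q):
    # Assumed example: planar Kepler hamiltonian.
    return 0.5 * (p[0] ** 2 + p[1] ** 2) - 1.0 / math.hypot(q[0], q[1])

def M(p, q):
    # Angular momentum, a first integral of any central field.
    return q[0] * p[1] - q[1] * p[0]

def P1(p, q):
    # First momentum component; not a first integral here.
    return p[0]

p0, q0 = [0.3, -0.7], [1.2, 0.5]
bracket_HM = poisson_bracket(H, M, p0, q0)    # close to zero: involution
bracket_P1M = poisson_bracket(P1, M, p0, q0)  # equals -p_2 in this convention
print(bracket_HM, bracket_P1M)
```

The pair $(H, M)$ gives a bracket that vanishes up to discretization error, while the pair $(p_1, M)$ does not, so $p_1$ and $M$ could not serve together as integrals in involution.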

87 The singular level sets, where the integrals are not functionally independent, constitute the
exception.

We notice that we have proved even more than Lemma 1:

1'. The manifold $M_f$ is invariant with respect to each of the n commuting
phase flows $g_i^t$ with hamiltonian functions $F_i$: $g_i^t g_j^s = g_j^s g_i^t$.

1''. The manifold $M_f$ is null (i.e., the 2-form $\omega^2$ is zero on $TM_f|_x$).

This is true since the n vectors $I\,dF_i|_x$ are skew-orthogonal to one another
($(F_i, F_j) = 0$) and form a basis of the tangent plane to the manifold $M_f$ at
the point $x$.

C Manifolds on which the action of the group $\mathbb{R}^n$ is transitive

We  will  now  use  the  following  topological  proposition  (the  proof  is  completed 
in  Section  D). 

Lemma 2. Let $M^n$ be a compact connected differentiable n-dimensional manifold,
on which we are given n pairwise commutative vector fields that are linearly
independent at each point. Then $M^n$ is diffeomorphic to an n-dimensional
torus.

Proof. We denote by $g_i^t$, $i = 1, \dots, n$, the one-parameter groups of
diffeomorphisms of $M$ corresponding to the n given vector fields. Since the fields
commute, the groups $g_i^t$ and $g_j^s$ commute. Therefore, we can define an action $g$
of the commutative group $\mathbb{R}^n = \{t\}$ on the manifold $M$ by setting

$$g^t : M \to M, \qquad g^t = g_1^{t_1} \cdots g_n^{t_n} \qquad (t = (t_1, \dots, t_n) \in \mathbb{R}^n).$$

Clearly, $g^{t+s} = g^t g^s$, $t, s \in \mathbb{R}^n$. Now fix a point $x_0 \in M$. Then we have a map

$$g : \mathbb{R}^n \to M, \qquad g(t) = g^t x_0.$$

(The point $x_0$ moves along the trajectory of the first flow for time $t_1$, along
the second flow for time $t_2$, etc.)

Problem 1. Show that the map $g$ (Figure 211) of a sufficiently small neighborhood $V$ of the
point $0 \in \mathbb{R}^n$ gives a chart in a neighborhood of $x_0$: every point $x_0 \in M$ has a neighborhood
$U$ ($x_0 \in U \subset M$) such that $g$ maps $V$ diffeomorphically onto $U$.

Hint. Apply the implicit function theorem and use the linear independence of the fields at $x_0$.

Problem 2. Show that $g : \mathbb{R}^n \to M$ is onto.

Hint. Connect a point $x \in M$ with $x_0$ by a curve (Figure 212), cover the curve by a finite
number of the neighborhoods $U$ of the preceding problem and define $t$ as the sum of shifts $t_i$
corresponding to pieces of the curve.

We note that the map $g : \mathbb{R}^n \to M^n$ cannot be one-to-one since $M^n$ is
compact and $\mathbb{R}^n$ is not. We will examine the set of pre-images of $x_0 \in M^n$.

Definition. The stationary group of the point $x_0$ is the set $\Gamma$ of points $t \in \mathbb{R}^n$
for which $g^t x_0 = x_0$.

Problem 3. Show that $\Gamma$ is a subgroup of the group $\mathbb{R}^n$, independent of the point $x_0$.

Solution. If $g^s x_0 = x_0$ and $g^t x_0 = x_0$, then $g^{s+t} x_0 = g^s g^t x_0 = g^s x_0 = x_0$ and
$g^{-t} x_0 = g^{-t} g^t x_0 = x_0$. Therefore, $\Gamma$ is a subgroup of $\mathbb{R}^n$. If $x = g^r x_0$ and $t \in \Gamma$, then
$g^t x = g^{t+r} x_0 = g^r g^t x_0 = g^r x_0 = x$.

In this way the stationary group $\Gamma$ is a well-defined subgroup of $\mathbb{R}^n$
independent of the point $x_0$. In particular, the point $t = 0$ clearly belongs
to $\Gamma$.


Problem 4. Show that, in a sufficiently small neighborhood $V$ of the point $0 \in \mathbb{R}^n$, there is no
point of the stationary group other than $t = 0$.

Hint. The map $g : V \to U$ is a diffeomorphism.

Problem 5. Show that, in the neighborhood $t + V$ of any point $t \in \Gamma \subset \mathbb{R}^n$, there is no point of
the stationary group $\Gamma$ other than $t$. (Figure 213)


Thus the points of the stationary group $\Gamma$ lie in $\mathbb{R}^n$ discretely. Such
subgroups are called discrete subgroups.

Figure 214 A discrete subgroup of the plane

Example. Let $e_1, \dots, e_k$ be k linearly independent vectors in $\mathbb{R}^n$, $0 \le k \le n$.
The set of all their integral linear combinations (Figure 214)

$$m_1 e_1 + \cdots + m_k e_k, \qquad m_i \in \mathbb{Z} = (\dots, -2, -1, 0, 1, \dots)$$

forms a discrete subgroup of $\mathbb{R}^n$. For example, the set of all integral points
in the plane is a discrete subgroup of the plane.

D Discrete subgroups in $\mathbb{R}^n$

We will now use the algebraic fact that the example above includes all discrete
subgroups of $\mathbb{R}^n$. More precisely, we will prove

Lemma 3. Let $\Gamma$ be a discrete subgroup of $\mathbb{R}^n$. Then there exist k ($0 \le k \le n$)
linearly independent vectors $e_1, \dots, e_k \in \Gamma$ such that $\Gamma$ is exactly the set of
all their integral linear combinations.

Proof. We will consider $\mathbb{R}^n$ with some euclidean structure. We always
have $0 \in \Gamma$. If $\Gamma = \{0\}$ the lemma is proved. If not, there is a point $e_0 \in \Gamma$,
$e_0 \ne 0$ (Figure 215). Consider the line $\mathbb{R}e_0$. We will show that among the
elements of $\Gamma$ on this line, there is a point $e_1$ which is closest to 0. In fact,
in the disk of radius $|e_0|$ with center 0, there are only a finite number of points
of $\Gamma$ (as we saw above, every point $x$ of $\Gamma$ has a neighborhood $V$ of standard
size which does not contain any other point of $\Gamma$). Among the finite number
of points of $\Gamma$ inside this disk and lying on the line $\mathbb{R}e_0$, the point closest to 0
will be the closest point to 0 on the whole line. The integral multiples of this
point $e_1$ ($m e_1$, $m \in \mathbb{Z}$) constitute the intersection of the line $\mathbb{R}e_0$ with $\Gamma$.


Figure  215  Proof  of  the  lemma  on  discrete  subgroups 



In fact, the points $m e_1$ divide the line into pieces of length $|e_1|$. If there were
a point $e \in \Gamma$ inside one of these pieces $(m e_1, (m + 1) e_1)$, then the point
$e - m e_1 \in \Gamma$ would be closer to 0 than $e_1$.

If there are no points of $\Gamma$ off the line $\mathbb{R}e_1$, the lemma is proved. Suppose
there is a point $e \in \Gamma$, $e \notin \mathbb{R}e_1$. We will show that there is a point $e_2 \in \Gamma$ closest
to the line $\mathbb{R}e_1$ (but not lying on the line). We project $e$ orthogonally onto $\mathbb{R}e_1$.
The projection lies in exactly one interval $\Delta = \{\lambda e_1\}$, $m \le \lambda < m + 1$.
Consider the right circular cylinder $C$ with axis $\Delta$ and radius equal to the
distance from $\Delta$ to $e$. In this cylinder lie a finite (nonempty) number of points
of the group $\Gamma$. Let $e_2$ be the closest one to the axis $\mathbb{R}e_1$ not lying on the axis.


Problem 6. Show that the distance from this axis to any point $e$ of $\Gamma$ not lying on $\mathbb{R}e_1$ is greater
than or equal to the distance of $e_2$ from $\mathbb{R}e_1$.

Hint. By a shift of $m e_1$ we can move the projection of $e$ onto the axis interval $\Delta$.

The integral linear combinations of $e_1$ and $e_2$ form a lattice in the plane
$\mathbb{R}e_1 + \mathbb{R}e_2$.

Problem 7. Show that there are no points of $\Gamma$ on the plane $\mathbb{R}e_1 + \mathbb{R}e_2$ other than integral
linear combinations of $e_1$ and $e_2$.

Hint. Partition the plane into parallelograms (Figure 216) $\Delta = \{\lambda_1 e_1 + \lambda_2 e_2\}$,
$m_i \le \lambda_i < m_i + 1$. If there were a point $e \in \Gamma$ in $\Delta$ with $e \ne m_1 e_1 + m_2 e_2$, then the point
$e - m_1 e_1 - m_2 e_2$ would be closer to $\mathbb{R}e_1$ than $e_2$.


Figure  216  Problem  7 


If there are no points of $\Gamma$ outside the plane $\mathbb{R}e_1 + \mathbb{R}e_2$, the lemma is
proved. Suppose that there is a point $e \in \Gamma$ outside this plane. Then there exists
a point $e_3 \in \Gamma$ closest to $\mathbb{R}e_1 + \mathbb{R}e_2$; the points $m_1 e_1 + m_2 e_2 + m_3 e_3$
exhaust $\Gamma$ in the three-dimensional space $\mathbb{R}e_1 + \mathbb{R}e_2 + \mathbb{R}e_3$. If $\Gamma$ is not
exhausted by these, we take the closest point to this three-dimensional
space, etc.

Problem  8.  Show  that  this  closest  point  always  exists. 

Hint.  Take  the  closest  of  the  finite  number  of  points  in  a  “cylinder”  C. 


Note that the vectors $e_1, e_2, e_3, \dots$ are linearly independent. Since they all
lie in $\mathbb{R}^n$, there are $k \le n$ of them.

Problem 9. Show that $\Gamma$ is exhausted by the integral linear combinations of $e_1, \dots, e_k$.

Hint. Partition the plane $\mathbb{R}e_1 + \cdots + \mathbb{R}e_k$ into parallelepipeds $\Delta$ and show that there cannot
be a point of $\Gamma$ in any $\Delta$. If there is an $e \in \Gamma$ outside the plane $\mathbb{R}e_1 + \cdots + \mathbb{R}e_k$, the construction
is not finished.

Thus Lemma 3 is proved. □

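It is now easy to prove Lemma 2: $M_f$ is diffeomorphic to a torus $T^n$.
Consider the direct product of k circles and $n - k$ straight lines:

$$T^k \times \mathbb{R}^{n-k} = \{(\varphi_1, \dots, \varphi_k;\ y_1, \dots, y_{n-k})\}, \qquad \varphi \bmod 2\pi,$$

together with the natural map $p : \mathbb{R}^n \to T^k \times \mathbb{R}^{n-k}$,

$$p(\varphi, y) = (\varphi \bmod 2\pi,\ y).$$

The points $f_1, \dots, f_k \in \mathbb{R}^n$ ($f_i$ has coordinates $\varphi_i = 2\pi$, $\varphi_j = 0$ for $j \ne i$, $y = 0$) are
mapped to 0 under this map.

Let $e_1, \dots, e_k \in \Gamma \subset \mathbb{R}^n$ be the generators of the group $\Gamma$ (cf. Lemma 3).
We map the vector space $\mathbb{R}^n = \{(\varphi, y)\}$ onto the space $\mathbb{R}^n = \{t\}$ so that the
vectors $f_i$ go to $e_i$. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be such an isomorphism.

We now note that $\mathbb{R}^n = \{(\varphi, y)\}$ gives charts for $T^k \times \mathbb{R}^{n-k}$, and $\mathbb{R}^n = \{t\}$
gives charts for our manifold $M_f$.

Problem 10. Show that the map of charts $A : \mathbb{R}^n \to \mathbb{R}^n$ gives a diffeomorphism
$\tilde{A} : T^k \times \mathbb{R}^{n-k} \to M_f$,

$$\begin{array}{ccc}
\mathbb{R}^n = \{(\varphi, y)\} & \xrightarrow{\ A\ } & \mathbb{R}^n = \{t\} \\
\downarrow p & & \downarrow g \\
T^k \times \mathbb{R}^{n-k} & \xrightarrow{\ \tilde{A}\ } & M_f
\end{array}$$

But, since the manifold $M_f$ is compact by hypothesis, $k = n$ and $M_f$ is an
n-dimensional torus. Lemma 2 is proved. □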


In view of Lemma 1, the first two statements of the theorem are proved.
At the same time, we have constructed angular coordinates $\varphi_1, \dots, \varphi_n \bmod 2\pi$
on $M_f$.

Problem 11. Show that under the action of the phase flow with hamiltonian $H$ the angular
coordinates $\varphi$ vary uniformly with time:

$$\dot{\varphi}_i = \omega_i, \qquad \omega_i = \omega_i(f), \qquad \varphi(t) = \varphi(0) + \omega t.$$

In other words, motion on the invariant torus $M_f$ is conditionally periodic.

Hint. $\varphi = A^{-1} t$.

Of all the assertions of the theorem, only the last remains to be proved:
that the system can be integrated by quadratures.

50 Action-angle variables

We show here that, under the hypotheses of Liouville’s theorem, we can find symplectic
coordinates $(I, \varphi)$ such that the first integrals $F$ depend only on $I$, and $\varphi$ are angular coordinates
on the torus $M_f$.


A  Description  of  action-angle  variables 

In Section 49 we studied one particular compact connected level manifold
of the integrals: $M_f = \{x : F(x) = f\}$; it turned out that $M_f$ was an
n-dimensional torus, invariant with respect to the phase flow. We chose angular
coordinates $\varphi_i$ on $M_f$ so that the phase flow with hamiltonian function $H = F_1$
takes an especially simple form:

$$\frac{d\varphi}{dt} = \omega(f), \qquad \varphi(t) = \varphi(0) + \omega t.$$

We will now look at a neighborhood of the n-dimensional manifold $M_f$
in 2n-dimensional phase space.


Problem. Show that the manifold $M_f$ has a neighborhood diffeomorphic to the direct product
of the n-dimensional torus $T^n$ and the disc $D^n$ in n-dimensional euclidean space.

Hint. Take the functions $F_i$ and the angles $\varphi_i$ constructed above as coordinates. In view of
the linear independence of the $dF_i$, the functions $F_i$ and $\varphi_i$ ($i = 1, \dots, n$) give a diffeomorphism
of a neighborhood of $M_f$ onto the direct product $T^n \times D^n$.


In the coordinates $(F, \varphi)$ the phase flow with hamiltonian function $H = F_1$
can be written in the form of the simple system of 2n ordinary differential
equations

$$(1) \qquad \frac{dF}{dt} = 0, \qquad \frac{d\varphi}{dt} = \omega(F),$$

which is easily integrated: $F(t) = F(0)$, $\varphi(t) = \varphi(0) + \omega(F(0))\,t$.

Thus, in order to integrate explicitly the original canonical system of
differential equations, it is sufficient to find the variables $\varphi$ in explicit form.
It turns out that this can be done using only quadratures. A construction of
the variables $\varphi$ is given below.

We note that the variables $(F, \varphi)$ are not, in general, symplectic
coordinates. It turns out that there are functions of $F$, which we will denote
by $I = I(F)$, $I = (I_1, \dots, I_n)$, such that the variables $(I, \varphi)$ are symplectic
coordinates: the original symplectic structure $\omega^2$ is expressed in them by
the usual formula

$$\omega^2 = \sum dI_i \wedge d\varphi_i.$$

The variables $I$ are called action variables;⁸⁸ together with the angle variables
$\varphi$ they form the action-angle system of canonical coordinates in a
neighborhood of $M_f$.

The quantities $I_i$ are first integrals of the system with hamiltonian function
$H = F_1$, since they are functions of the first integrals $F_j$. In turn, the variables
$F_i$ can be expressed in terms of $I$ and, in particular, $H = F_1 = H(I)$. In
action-angle variables the differential equations of our flow (1) have the form

$$(2) \qquad \frac{dI}{dt} = 0, \qquad \frac{d\varphi}{dt} = \omega(I).$$

Problem. Can the functions $\omega(I)$ in (2) be arbitrary?

Solution. In the variables $(I, \varphi)$, the equations of the flow (2) have the canonical form with
hamiltonian function $H(I)$. Therefore, $\omega(I) = \partial H/\partial I$; thus if the number of degrees of freedom
is $n \ge 2$, the functions $\omega(I)$ are not arbitrary, but satisfy the symmetry condition
$\partial\omega_i/\partial I_j = \partial\omega_j/\partial I_i$.

Action-angle variables are especially important for perturbation theory;
in Section 52 we will demonstrate their application to the theory of adiabatic
invariants.

B  Construction  of  action-angle  variables  in  the 
case  of  one  degree  of  freedom 

A system with one degree of freedom in the phase plane $(p, q)$ is given by the
hamiltonian function $H(p, q)$.

Example 1. The harmonic oscillator $H = \tfrac12 p^2 + \tfrac12 q^2$; or, more generally,
$H = \tfrac12 a^2 p^2 + \tfrac12 b^2 q^2$.

Example 2. The mathematical pendulum $H = \tfrac12 p^2 - \cos q$. In both cases
we have a compact closed curve $M_h$ ($H = h$), and the conditions of the
theorem of Section 49 for $n = 1$ are satisfied.

In order to construct the action-angle variables, we will look for a
canonical transformation $(p, q) \to (I, \varphi)$ satisfying the two conditions:

$$(3) \qquad 1.\ I = I(h), \qquad 2.\ \oint_{M_h} d\varphi = 2\pi.$$

Problem. Find the action-angle variables in the case of the simple harmonic oscillator
$H = \tfrac12 p^2 + \tfrac12 q^2$.

Solution. If $r$, $\varphi$ are polar coordinates, then $dp \wedge dq = r\,dr \wedge d\varphi = d(r^2/2) \wedge d\varphi$.
Therefore, $I = H = (p^2 + q^2)/2$.


88 It is not hard to see that $I$ has the dimensions of action.

In order to construct the canonical transformation $p, q \to I, \varphi$ in the
general case, we will look for its generating function $S(I, q)$:

$$(4) \qquad p = \frac{\partial S(I, q)}{\partial q}, \qquad \varphi = \frac{\partial S(I, q)}{\partial I}, \qquad H\!\left(\frac{\partial S(I, q)}{\partial q},\ q\right) = h(I).$$

We first assume that the function $h(I)$ is known and invertible, so that every
curve $M_h$ is determined by the value of $I$ ($M_h = M_{h(I)}$). Then for a fixed
value of $I$ we have from (4)

$$dS|_{I = \text{const}} = p\,dq.$$

This relation determines a well-defined differential 1-form $dS$ on the curve
$M_{h(I)}$.

Integrating this 1-form on the curve $M_{h(I)}$ we obtain (in a neighborhood
of a point $q_0$) a function

$$S(I, q) = \int_{q_0}^{q} p\,dq.$$

This function will be the generating function of the transformation (4) in
a neighborhood of the point $(I, q_0)$. The first of the conditions (3) is satisfied
automatically: $I = I(h)$. To verify the second condition, we consider the
behavior of $S(I, q)$ “in the large.” After a circuit of the closed curve $M_{h(I)}$ the
integral of $p\,dq$ increases by

$$\Delta S(I) = \oint_{M_{h(I)}} p\,dq,$$

equal to the area $\Pi$ enclosed by the curve $M_{h(I)}$. Therefore, the function $S$
is a “multiple-valued function” on $M_{h(I)}$: it is determined up to addition
of integral multiples of $\Pi$. This term has no effect on the derivative $\partial S(I, q)/\partial q$;
but it leads to the multi-valuedness of $\varphi = \partial S/\partial I$. This derivative turns out
to be defined only up to multiples of $d\,\Delta S(I)/dI$. More precisely, the formulas
(4) define a 1-form $d\varphi$ on the curve $M_{h(I)}$, and the integral of this form on
$M_{h(I)}$ is equal to $d\,\Delta S(I)/dI$.

In order to fulfill the second condition, $\oint_{M_h} d\varphi = 2\pi$, we need that

$$\frac{d}{dI}\,\Delta S(I) = 2\pi, \qquad I = \frac{\Delta S}{2\pi} = \frac{\Pi}{2\pi},$$

where $\Pi = \oint_{M_h} p\,dq$ is the area bounded by the phase curve $H = h$.

Definition. The action variable in the one-dimensional problem with
hamiltonian function $H(p, q)$ is the quantity $I(h) = (1/2\pi)\,\Pi(h)$.


Finally, we arrive at the following conclusion. Let $d\Pi/dh \ne 0$. Then the
inverse $h(I)$ of the function $I(h)$ is defined.

Theorem. Set $S(I, q) = \int_{q_0}^{q} p\,dq\,|_{H = h(I)}$. Then formulas (4) give a canonical
transformation $p, q \to I, \varphi$ satisfying conditions (3).

Thus, the action-angle variables in the one-dimensional case are
constructed.

Problem. Find $S$ and $I$ for a harmonic oscillator.

Answer. If $H = \tfrac12 a^2 p^2 + \tfrac12 b^2 q^2$ (Figure 217), then $M_h$ is the ellipse bounding the
area $\Pi(h) = \pi(\sqrt{2h}/a)(\sqrt{2h}/b) = 2\pi h/ab = 2\pi h/\omega$. Thus for a harmonic oscillator the action
variable is the ratio of energy to frequency. The angle variable $\varphi$ is, of course, the phase of
oscillation.
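As a numerical illustration of this answer, the following sketch evaluates $I = (1/2\pi)\oint p\,dq$ directly on the ellipse $H = h$ and compares it with $h/(ab)$. The parameter values, the parametrization of the ellipse and the function names are assumptions made for the example, not part of the text.

```python
import math

def action_variable(p_of_theta, dq_dtheta, n=2000):
    """Approximate I = (1/2π) ∮ p dq for a closed curve parametrized by θ in [0, 2π)."""
    dtheta = 2 * math.pi / n
    loop_integral = sum(p_of_theta(k * dtheta) * dq_dtheta(k * dtheta)
                        for k in range(n)) * dtheta
    return loop_integral / (2 * math.pi)

# Illustrative parameter values (assumptions, not from the text).
a, b, h = 1.5, 2.0, 0.8
p_amp = math.sqrt(2 * h) / a   # semi-axis of the ellipse H = h along p
q_amp = math.sqrt(2 * h) / b   # semi-axis of the ellipse H = h along q

# Parametrize the ellipse as p = p_amp cos θ, q = q_amp sin θ, so dq/dθ = q_amp cos θ.
I_numeric = action_variable(lambda th: p_amp * math.cos(th),
                            lambda th: q_amp * math.cos(th))
I_formula = h / (a * b)        # energy divided by the frequency ω = ab
print(I_numeric, I_formula)
```

Because the integrand is a trigonometric polynomial in θ, the equally spaced sum reproduces the loop integral to rounding accuracy, so the two printed values agree closely.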


Figure 217 Action variable for a harmonic oscillator

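Problem. Show that the period $T$ of motion along the closed curve $H = h$ on the phase plane
$p, q$ is equal to the derivative with respect to $h$ of the area bounded by this curve:

$$T = \frac{d\Pi(h)}{dh}.$$

Solution. In action-angle variables the equations of motion (2) give

$$\dot{\varphi} = \frac{\partial H}{\partial I} = \left(\frac{dI}{dh}\right)^{-1} = 2\pi \left(\frac{d\Pi}{dh}\right)^{-1}, \qquad T = \frac{2\pi}{\dot{\varphi}} = \frac{d\Pi}{dh}.$$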
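The relation $T = d\Pi/dh$ can be tested numerically on the pendulum $H = \tfrac12 p^2 - \cos q$ of Example 2. The sketch below is an illustration under stated assumptions: it computes the area $\Pi(h)$ by a midpoint rule after the substitution $q = q_{\max} \sin\theta$, differentiates it by a central difference, and compares the result with the classical closed form $4K(k)$, $k^2 = (1 + h)/2$, for the period of oscillation, where $K$ is the complete elliptic integral of the first kind evaluated through the arithmetic-geometric mean. None of these numerical choices come from the text.

```python
import math

def area(h, n=4000):
    """Area Π(h) enclosed by the curve p²/2 - cos q = h, for -1 < h < 1."""
    q_max = math.acos(-h)                 # turning point: cos q_max = -h
    d_theta = math.pi / n
    total = 0.0
    for k in range(n):                    # midpoint rule in θ, with q = q_max sin θ
        theta = -math.pi / 2 + (k + 0.5) * d_theta
        q = q_max * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * (h + math.cos(q))))
        total += p * q_max * math.cos(theta) * d_theta
    return 2.0 * total                    # upper and lower halves of the curve

def period_exact(h):
    """Pendulum period 4 K(k), k² = (1 + h)/2, with K computed by the AGM."""
    x, y = 1.0, math.sqrt(1.0 - (1.0 + h) / 2.0)
    for _ in range(40):
        x, y = (x + y) / 2.0, math.sqrt(x * y)
    return 4.0 * math.pi / (2.0 * x)      # K(k) = π / (2 AGM(1, sqrt(1 - k²)))

h, dh = 0.0, 1e-4
period_from_area = (area(h + dh) - area(h - dh)) / (2 * dh)
print(period_from_area, period_exact(h))
```

The two printed numbers should agree to several digits; the energy $h = 0$ (amplitude $\pi/2$) was chosen only for illustration, and any $-1 < h < 1$ can be used.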


C Construction of action-angle variables in $\mathbb{R}^{2n}$

We turn now to systems with n degrees of freedom given in $\mathbb{R}^{2n} = \{(p, q)\}$
by a hamiltonian function $H(p, q)$ and having n first integrals in involution
$F_1 = H, F_2, \dots, F_n$. We will not repeat the reasoning which brought us to
the choice of $2\pi I = \oint p\,dq$ in the one-dimensional case, but will immediately
define n action variables $I$.

Let $\gamma_1, \dots, \gamma_n$ be a basis for the one-dimensional cycles on the torus $M_f$
(the increase of the coordinate $\varphi_i$ on the cycle $\gamma_j$ is equal to $2\pi$ if $i = j$ and
0 if $i \ne j$). We set

$$(5) \qquad I_i(f) = \frac{1}{2\pi} \oint_{\gamma_i} p\,dq.$$

Figure 218 Independence of the curve of integration for the action variable

Problem. Show that this integral does not depend on the choice of the curve $\gamma_i$ representing
the cycle (Figure 218).

Hint. In Section 49 we showed that the 2-form $\omega^2 = \sum dp_i \wedge dq_i$ on the manifold $M_f$ is
equal to zero. By Stokes' formula,

$$\oint_{\gamma} p\,dq - \oint_{\gamma'} p\,dq = \iint_{\sigma} dp \wedge dq = 0,$$

where $\partial\sigma = \gamma - \gamma'$.


Definition. The n quantities $I_i(f)$ given by formula (5) are called the action
variables.

We assume now that, for the given values $f$ of the n integrals $F_i$, the n
quantities $I_i$ are independent: $\det(\partial I/\partial f)|_f \ne 0$. Then in a neighborhood
of the torus $M_f$ we can take the variables $I, \varphi$ as coordinates.

Theorem. The transformation $p, q \to I, \varphi$ is canonical, i.e.,

$$\sum dp_i \wedge dq_i = \sum dI_i \wedge d\varphi_i.$$

We outline the proof of this theorem. Consider the differential 1-form
$p\,dq$ on $M_f$. Since the manifold $M_f$ is null (Section 49) this 1-form on $M_f$
is closed: its exterior derivative $\omega^2 = dp \wedge dq$ is identically equal to zero
on $M_f$. Therefore (Figure 219),

$$S(x) = \int_{x_0}^{x} p\,dq\,\Big|_{M_f}$$

does not change under deformations of the path of integration (Stokes’
formula). Thus $S(x)$ is a “multiple-valued function” on $M_f$, with periods
equal to

$$\Delta_i S = \oint_{\gamma_i} dS = 2\pi I_i.$$

Figure 219 Independence of the path for the integral of $p\,dq$ on $M_f$

Now let $x_0$ be a point on $M_f$, in a neighborhood of which the n variables
$q$ are coordinates on $M_f$, such that the submanifold $M_f \subset \mathbb{R}^{2n}$ is given by n
equations of the form $p = p(I, q)$, $q(x_0) = q_0$. In a simply connected neighborhood
of the point $q_0$ a single-valued function is defined,

$$S(I, q) = \int_{q_0}^{q} p(I, q)\,dq,$$

and we can use it as the generating function of a canonical transformation
$p, q \to I, \varphi$:

$$p = \frac{\partial S}{\partial q}, \qquad \varphi = \frac{\partial S}{\partial I}.$$

It is not difficult to verify that these formulas actually give a canonical
transformation, not only in a neighborhood of the point under consideration,
but also “in the large” in a neighborhood of $M_f$. The coordinates $\varphi$ will be
multiple-valued with periods

$$\Delta_i \varphi_j = \Delta_i \frac{\partial S}{\partial I_j} = \frac{\partial}{\partial I_j}\,\Delta_i S = \frac{\partial}{\partial I_j}\,2\pi I_i = 2\pi \delta_{ij},$$

as was to be shown.


□ 


We now note that all our constructions involve only “algebraic”
operations (inverting functions) and “quadrature”: calculation of the
integrals of known functions. In this way the problem of integrating a
canonical system with 2n equations, of which n first integrals in involution
are known, is solved by quadratures, which proves the last assertion of
Liouville’s theorem (Section 49). □

Remark 1. Even in the one-dimensional case the action-angle variables
are not uniquely defined by the conditions (3). We could have taken
$I' = I + \text{const}$ for the action variable and $\varphi' = \varphi + c(I)$ for the angle
variable.


Remark 2. We constructed action-angle variables for systems with phase
space $\mathbb{R}^{2n}$. We could also have introduced action-angle variables for a system
on an arbitrary symplectic manifold. We restrict ourselves here to one simple
example (Figure 220).

Figure 220 Action-angle variables on a symplectic manifold

We could have taken the phase space of a pendulum ($H = \tfrac12 p^2 - \cos q$)
to be, instead of the plane $\{(p, q)\}$, the surface of the cylinder $\mathbb{R}^1 \times S^1$
obtained by identifying angles $q$ differing by an integral multiple of $2\pi$.

The critical level lines $H = \pm 1$ divide the cylinder into three parts,
A, B, and C, each of which is diffeomorphic to the direct product $\mathbb{R}^1 \times S^1$.
We can introduce action-angle variables into each part. In the bounded part
(B) the closed trajectories represent the oscillation of the pendulum; in
the unbounded parts they represent rotation.


Remark 3. In the general case, as in the example analyzed above, the
equations $F_i = f_i$ cease to be independent for some values of $f_i$, and $M_f$ ceases
to be a manifold. Such critical values of $f$ correspond to separatrices dividing
the phase space of the integrable problem into parts corresponding to the
parts A, B, and C above. In some of these parts the manifolds $M_f$ can be
unbounded (parts A and C in the plane $\{(p, q)\}$); others are stratified into
n-dimensional invariant tori $M_f$; in a neighborhood of such a torus we
can introduce action-angle variables.


51  Averaging 

In this paragraph we show that time averages and space averages are equal for systems
undergoing conditionally-periodic motion.


A  Conditionally-periodic  motion 

In the earlier sections of this book, we have frequently encountered
conditionally-periodic motion: Lissajous figures, precession, nutation, rotation
of a top, etc.

Definition. Let $T^n$ be the n-dimensional torus and $\varphi = (\varphi_1, \dots, \varphi_n) \bmod 2\pi$
angular coordinates. Then by a conditionally-periodic motion we mean a
one-parameter group of diffeomorphisms $T^n \to T^n$ given by the
differential equations (Figure 221):

$$\dot{\varphi} = \omega, \qquad \omega = (\omega_1, \dots, \omega_n) = \text{const}.$$

Figure 221 Conditionally-periodic motion

These differential equations are easily integrated:

$$\varphi(t) = \varphi(0) + \omega t.$$

Thus the trajectories in the chart $\{\varphi\}$ are straight lines. A trajectory on the
torus is called a winding of the torus.

Example. Let $n = 2$. If $\omega_1/\omega_2 = k_1/k_2$ with integers $k_1$, $k_2$, the trajectories are closed; if $\omega_1/\omega_2$ is irrational, then
trajectories on the torus are dense (cf. Section 16).
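For the rational case, the time after which a winding closes can be computed explicitly. The sketch below is an illustration only: it assumes both frequencies are nonzero rational numbers given as `Fraction` values, reduces $\omega_1/\omega_2$ to lowest terms $k_1/k_2$, and returns $T = 2\pi |k_1| / |\omega_1|$, the first time at which both angles have advanced by whole multiples of $2\pi$. The helper names are hypothetical.

```python
import math
from fractions import Fraction

def closing_time(omega1, omega2):
    """Smallest T > 0 with omega1*T and omega2*T both integer multiples of 2π.

    Assumes both frequencies are nonzero rationals (Fraction or int).
    """
    ratio = Fraction(omega1) / Fraction(omega2)   # k1/k2 in lowest terms
    k1 = abs(ratio.numerator)
    return 2 * math.pi * k1 / abs(float(omega1))

def torus_distance(phi_a, phi_b):
    """Largest coordinatewise angular separation of two points, taken mod 2π."""
    def sep(x, y):
        d = (x - y) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    return max(sep(x, y) for x, y in zip(phi_a, phi_b))

omega = (Fraction(3, 2), Fraction(1, 2))          # illustrative frequencies
T = closing_time(*omega)
start = (0.4, 1.1)
end = tuple(s + float(w) * T for s, w in zip(start, omega))
print(T / math.pi, torus_distance(start, end))
```

With these frequencies the ratio is $3/1$, so the trajectory closes after the first angle has made three turns and the second one turn. Floating-point numbers cannot represent an irrational ratio exactly, so the dense case is outside the scope of this sketch.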

The quantities $\omega_1, \dots, \omega_n$ are called the frequencies of the
conditionally-periodic motion. The frequencies are called independent if they are linearly
independent over the field of rational numbers: if $k \in \mathbb{Z}^n$⁸⁹ and $(k, \omega) = 0$,
then $k = 0$.

B  Space  average  and  time  average 

Let $f(\varphi)$ be an integrable function on the torus $T^n$.

Definition. The space average of a function $f$ on the torus $T^n$ is the number

$$\bar{f} = (2\pi)^{-n} \int_0^{2\pi} \cdots \int_0^{2\pi} f(\varphi)\,d\varphi_1 \cdots d\varphi_n.$$

Consider the value of the function $f(\varphi)$ on the trajectory $\varphi(t) = \varphi_0 + \omega t$.
This is a function of time, $f(\varphi_0 + \omega t)$. We consider its average.

Definition. The time average of the function $f$ on the torus $T^n$ is the function

$$f^*(\varphi_0) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(\varphi_0 + \omega t)\,dt$$

(defined where the limit exists).

Theorem on the averages. The time average exists everywhere, and coincides
with the space average if $f$ is continuous (or merely Riemann integrable)
and the frequencies $\omega_i$ are independent.

89 $k = (k_1, \dots, k_n)$ with integral $k_i$.

Problem. Show that if the frequencies are dependent, then the time average can differ from the
space average.
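Both the theorem and this problem can be explored numerically. The sketch below is an illustration under assumptions not found in the text: it takes $n = 2$, the test function $f(\varphi) = (1 + \cos\varphi_1)(1 + \cos\varphi_2)$, whose space average is 1, the starting point $\varphi_0 = 0$, and a finite averaging time, so it only approximates the limit $T \to \infty$.

```python
import math

def f(phi1, phi2):
    # Test function with space average 1 over the 2-torus.
    return (1.0 + math.cos(phi1)) * (1.0 + math.cos(phi2))

def time_average(omega, phi0=(0.0, 0.0), T=2000.0, dt=0.01):
    """Midpoint-rule estimate of (1/T) ∫_0^T f(phi0 + omega t) dt."""
    steps = int(round(T / dt))
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += f(phi0[0] + omega[0] * t, phi0[1] + omega[1] * t)
    return total * dt / T

space_average = 1.0                                  # (2π)^-2 ∬ f dφ1 dφ2, computed by hand
independent = time_average((1.0, math.sqrt(2.0)))    # no integral relation
dependent = time_average((1.0, 1.0))                 # relation k = (1, -1)
print(independent, dependent)
```

For the frequencies $(1, \sqrt{2})$ the finite-time average is close to the space average 1, with an error that decays like $1/T$. For the dependent frequencies $(1, 1)$ and this starting point, $f$ reduces to $(1 + \cos t)^2$ along the trajectory, whose average is $3/2$, so the two averages differ.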

Corollary 1. If the frequencies are independent, then every trajectory $\{\varphi(t)\}$
is dense on the torus $T^n$.

Proof. Assume the contrary. Then in some neighborhood $D$ of some point
of the torus, there is no point of the trajectory $\varphi(t)$. It is easy to construct a
continuous function $f$ equal to zero outside $D$ and with space average equal
to 1. The time average $f^*(\varphi_0)$ on the trajectory $\varphi(t)$ is equal to $0 \ne 1$.
This contradicts the assertion of the theorem. □


Corollary  2.  If  the  frequencies  are  independent,  then  every  trajectory  is 
uniformly  distributed  on  the  torus  Tn. 


This  means  that  the  time  the  trajectory  spends  in  a  neighborhood  D  is 
proportional  to  the  measure  of  D. 

More precisely, let $D$ be a (Jordan) measurable region of $T^n$. We denote
by $\tau_D(T)$ the amount of time that the interval $0 \le t \le T$ of the trajectory
$\varphi(t)$ is inside of $D$. Then

$$\lim_{T \to \infty} \frac{\tau_D(T)}{T} = \frac{\operatorname{mes} D}{(2\pi)^n}.$$

Proof. We apply the theorem to the characteristic function $f$ of the set $D$
($f$ is Riemann integrable since $D$ is Jordan measurable). Then
$\int_0^T f(\varphi(t))\,dt = \tau_D(T)$, and $\bar{f} = (2\pi)^{-n} \operatorname{mes} D$, and the corollary follows immediately from
the theorem. □


Corollary. In the sequence

1, 2, 4, 8, 1, 3, 6, 1, 2, 5, 1, 2, ...

of first digits of the numbers $2^n$, the number 7 appears $(\log 8 - \log 7)/(\log 9 - \log 8)$ times as
often as 8.
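This corollary is easy to probe by direct counting. The first digit of $2^n$ is $d$ exactly when the fractional part of $n \log_{10} 2$ lies in $[\log_{10} d, \log_{10}(d + 1))$, so the digit frequencies reflect the uniform distribution of a rotation of the circle by an irrational angle. The sketch below is an illustration, not part of the text; the cutoff `N` is an arbitrary choice, and a finite sample only approximates the limiting ratio.

```python
import math

def leading_digit_counts(n_terms):
    """Count the first decimal digits of 2**n for n = 0, ..., n_terms - 1."""
    counts = [0] * 10
    value = 1
    for _ in range(n_terms):
        counts[int(str(value)[0])] += 1
        value *= 2
    return counts

N = 10000            # 2**9999 has about 3010 digits, so str() stays cheap
counts = leading_digit_counts(N)
observed_ratio = counts[7] / counts[8]
predicted_ratio = (math.log(8) - math.log(7)) / (math.log(9) - math.log(8))
print(counts[1:], observed_ratio, predicted_ratio)
```

The predicted ratio is about 1.13, so the digit 7 is slightly more frequent than 8, and both are much rarer than 1.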

The  theorem  on  averages  may  be  found  implicitly  in  the  work  of  Laplace, 
Lagrange,  and  Gauss  on  celestial  mechanics;  it  is  one  of  the  first  “ergodic 
theorems.”  A  rigorous  proof  was  given  only  in  1909  by  P.  Bohl,  W.  Sierpinski, 
and  H.  Weyl  in  connection  with  a  problem  of  Lagrange  on  the  mean  motion 
of  the  earth’s  perihelion.  Below  we  reproduce  H.  Weyl’s  proof. 

C  Proof  of  the  theorem  on  averages 

Lemma 1. The theorem is true for exponentials $f = e^{i(k, \varphi)}$, $k \in \mathbb{Z}^n$.

Proof. If $k = 0$, then $f = \bar{f} = f^* = 1$ and the theorem is obvious. If $k \ne 0$,
then $\bar{f} = 0$. On the other hand,

$$\int_0^T e^{i(k,\ \varphi_0 + \omega t)}\,dt = e^{i(k, \varphi_0)}\,\frac{e^{i(k, \omega)T} - 1}{i(k, \omega)}.$$

Therefore, the time average is

$$\lim_{T \to \infty} \frac{e^{i(k, \varphi_0)}}{i(k, \omega)}\,\frac{e^{i(k, \omega)T} - 1}{T} = 0.$$

□

Lemma 2. The theorem is true for trigonometric polynomials

$$f = \sum_{|k| < N} f_k\,e^{i(k, \varphi)}.$$

Proof. Both the time and space averages depend linearly on $f$, and therefore
agree by Lemma 1. □


Lemma 3. Let $f$ be a real continuous (or at least Riemann integrable) function.
Then, for any $\varepsilon > 0$, there exist two trigonometric polynomials $P_1$ and $P_2$
such that $P_1 < f < P_2$ and $(1/(2\pi)^n) \int_{T^n} (P_2 - P_1)\,d\varphi < \varepsilon$.

Proof. Suppose first that $f$ is continuous. By the Weierstrass theorem, we
can approximate $f$ by a trigonometric polynomial $P$ with $|f - P| < \tfrac12\varepsilon$.
The polynomials $P_1 = P - \tfrac12\varepsilon$ and $P_2 = P + \tfrac12\varepsilon$ are the ones we are looking
for.

If $f$ is not continuous but Riemann integrable, then there are two continuous
functions $f_1$ and $f_2$ such that $f_1 < f < f_2$ and $(2\pi)^{-n} \int (f_2 - f_1)\,d\varphi < \tfrac13\varepsilon$
(Figure 222 corresponds to the characteristic function of an interval).
By approximating $f_1$ and $f_2$ by polynomials $P_1 < f_1 < f_2 < P_2$,
$(2\pi)^{-n} \int (P_2 - f_2)\,d\varphi < \tfrac13\varepsilon$, $(2\pi)^{-n} \int (f_1 - P_1)\,d\varphi < \tfrac13\varepsilon$, we obtain what we
need. Lemma 3 is proved. □

Figure 222 Approximation of the function $f$ by trigonometric polynomials $P_1$ and $P_2$


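It is now easy to finish the proof of the theorem. Let $\varepsilon > 0$. Then, by Lemma 3, there are trigonometric polynomials $P_1 < f < P_2$ with $(2\pi)^{-n} \int (P_2 - P_1)\,d\varphi < \varepsilon$.

For any T, we then have

$$\frac{1}{T}\int_0^T P_1(\varphi(t))\,dt < \frac{1}{T}\int_0^T f(\varphi(t))\,dt < \frac{1}{T}\int_0^T P_2(\varphi(t))\,dt.$$

By Lemma 2, for $T > T_0(\varepsilon)$,

$$\left|\bar P_i - \frac{1}{T}\int_0^T P_i(\varphi(t))\,dt\right| < \varepsilon \qquad (i = 1, 2).$$

Furthermore, $\bar P_1 < \bar f < \bar P_2$ and $\bar P_2 - \bar P_1 < \varepsilon$. Therefore, $\bar P_2 - \bar f < \varepsilon$ and $\bar f - \bar P_1 < \varepsilon$; therefore, for $T > T_0(\varepsilon)$,

$$\left|\frac{1}{T}\int_0^T f(\varphi(t))\,dt - \bar f\right| < 2\varepsilon,$$

as was to be proved.

□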
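The theorem on averages can be checked numerically. The following is only a sketch and is not part of the original text: the test function `f`, the frequencies, the initial phase, and the step sizes are illustrative assumptions. It compares the time average of a trigonometric polynomial along the trajectory $\varphi(t) = \varphi_0 + \omega t$ on the two-dimensional torus with its space average.

```python
import math


def f(phi1: float, phi2: float) -> float:
    """Illustrative trigonometric polynomial on the 2-torus (space average 3/2)."""
    return (1.0 + math.cos(phi1)) * (1.0 + math.sin(phi2)) ** 2


def time_average(omega, phi0, T: float, dt: float = 0.01) -> float:
    """Midpoint-rule estimate of (1/T) * integral_0^T f(phi0 + omega*t) dt."""
    n = int(T / dt)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += f(phi0[0] + omega[0] * t, phi0[1] + omega[1] * t)
    return total / n


def space_average(m: int = 200) -> float:
    """Average of f over a uniform m-by-m grid on the torus."""
    h = 2.0 * math.pi / m
    total = sum(f(i * h, j * h) for i in range(m) for j in range(m))
    return total / (m * m)


if __name__ == "__main__":
    phi0 = (0.3, 1.1)
    print("space average:        ", space_average())
    print("independent (1, sqrt2):", time_average((1.0, math.sqrt(2.0)), phi0, 5000.0))
    print("resonant (1, 1):       ", time_average((1.0, 1.0), phi0, 5000.0))
```

With independent frequencies the time average should approach the space average as T grows. With the resonant choice (1, 1) the trajectory is closed, and the time average generally depends on $\varphi_0$ and differs from the space average.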


Problem. A two-dimensional oscillator with kinetic energy $T = \tfrac{1}{2}\dot x^2 + \tfrac{1}{2}\dot y^2$ and potential energy $U = \tfrac{1}{2}x^2 + y^2$ performs an oscillation with amplitudes $a_x = 1$ and $a_y = 1$. Find the time average of the kinetic energy.
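One possible line of computation for this problem is sketched below. It is not taken from the original; the phases $\theta_x$, $\theta_y$ are notation introduced here. The equations of motion are $\ddot x = -x$, $\ddot y = -2y$, so the frequencies are 1 and $\sqrt 2$.

```latex
\begin{align*}
x &= \cos(t + \theta_x), \qquad y = \cos(\sqrt{2}\,t + \theta_y), \\
T &= \tfrac12 \dot x^2 + \tfrac12 \dot y^2
   = \tfrac12 \sin^2(t + \theta_x) + \sin^2(\sqrt{2}\,t + \theta_y), \\
T^* &= \tfrac12 \cdot \tfrac12 + \tfrac12 = \tfrac34 .
\end{align*}
```

Each term depends on a single angle, so the average of $\sin^2$ over a period, equal to $\tfrac12$, is all that is used.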

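Problem.90 Let $\omega_k$ be independent, $a_k > 0$. Calculate

$$\lim_{t \to \infty} \frac{1}{t} \arg \sum_{k=1}^{3} a_k e^{i\omega_k t}.$$

Answer. $(\omega_1\alpha_1 + \omega_2\alpha_2 + \omega_3\alpha_3)/\pi$, where $\alpha_1$, $\alpha_2$, and $\alpha_3$ are the angles of the triangle with sides $a_k$ (Figure 223).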


Figure  223  Problem  on  mean  motion  of  perihelia 
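The answer to the mean-motion problem can be compared with a direct numerical experiment. The sketch below is an illustration only: the sides (3, 4, 5), the frequencies (1, √2, √3), the time span, and the step are assumptions, and $\alpha_k$ is taken to be the angle opposite the side $a_k$ (this reading is an assumption, chosen because it gives the mean motion $\omega_1$ in the degenerate case $a_1 = a_2 + a_3$).

```python
import cmath
import math


def triangle_angles(a):
    """Angles of the triangle with sides a[0..2]; angle k lies opposite side a[k]."""
    angles = []
    for k in range(3):
        p, q = a[(k + 1) % 3], a[(k + 2) % 3]
        angles.append(math.acos((p * p + q * q - a[k] ** 2) / (2.0 * p * q)))
    return angles


def predicted_mean_motion(a, omega):
    """(omega_1*alpha_1 + omega_2*alpha_2 + omega_3*alpha_3) / pi."""
    return sum(w * al for w, al in zip(omega, triangle_angles(a))) / math.pi


def measured_mean_motion(a, omega, T=5000.0, dt=0.01):
    """Continuous change of arg(sum_k a_k exp(i*omega_k*t)) over [0, T], divided by T."""
    def s(t):
        return sum(ak * cmath.exp(1j * wk * t) for ak, wk in zip(a, omega))

    n = int(T / dt)
    total = 0.0
    prev = cmath.phase(s(0.0))
    for i in range(1, n + 1):
        cur = cmath.phase(s(i * dt))
        step = cur - prev
        # Unwrap: reduce the increment modulo 2*pi to the nearest representative.
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))
        total += step
        prev = cur
    return total / T


if __name__ == "__main__":
    a = (3.0, 4.0, 5.0)
    omega = (1.0, math.sqrt(2.0), math.sqrt(3.0))
    print("predicted:", predicted_mean_motion(a, omega))
    print("measured: ", measured_mean_motion(a, omega))
```

The measured value converges slowly (roughly like 1/T), so only a few digits of agreement should be expected over a finite time span.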


D  Degeneracies 

So far we have considered the case when the frequencies $\omega$ are independent. An integral vector $k \in \mathbb{Z}^n$ is called a relation among the frequencies if $(k, \omega) = 0$.

Problem. Show that the set of all relations between a given set of frequencies $\omega$ is a subgroup $\Gamma$ of the lattice $\mathbb{Z}^n$.

We saw in Section 49 that such a subgroup consists entirely of linear combinations of r independent vectors $k_i$, $1 \le r \le n$. We say that there are r (independent) relations among the frequencies.91

90 Lagrange showed that the investigation of the average motion of the perihelion of a planet reduces to a similar problem. The solution of this problem can be found in the work of H. Weyl. The eccentricity of the earth’s orbit varies as the modulus of an analogous sum. Ice ages appear to be related to these changes in eccentricity.

91 Show that the number r does not depend on the choice of independent vectors $k_i$.

Problem. Show that the closure of a trajectory $\{\varphi(t) = \varphi_0 + \omega t\}$ (on $T^n$) is a torus of dimension $n - r$ if there are r independent relations among the frequencies $\omega$; in this case the motion on $T^{n-r}$ is conditionally-periodic with $n - r$ independent frequencies.

We turn now to the integrable hamiltonian system given in action-angle variables $I, \varphi$ by the equations

$$\dot I = 0, \qquad \dot\varphi = \omega(I), \qquad \text{where } \omega(I) = \frac{\partial H}{\partial I}.$$

Every n-dimensional torus I = const in the 2n-dimensional phase space is invariant, and motion on it is conditionally-periodic.


Definition. A system is called nondegenerate if the determinant

$$\det \frac{\partial \omega}{\partial I} = \det \frac{\partial^2 H}{\partial I^2}$$

is not zero.

Problem.  Show  that,  if  a  system  is  nondegenerate,  then  in  any  neighborhood  of  any  point  there 
is  a  conditionally-periodic  motion  with  n  frequencies,  and  also  with  any  smaller  number  of 
frequencies. 

Hint. We can take the frequencies $\omega$ themselves instead of the variables I as local coordinates. In the space of collections of frequencies, the set of points $\omega$ with any number of relations r ($0 \le r < n$) is dense.


Corollary. If a system is nondegenerate, then the invariant tori I = const are uniquely defined, independent of the choice of action-angle coordinates $I, \varphi$, the construction of which always involves some arbitrariness.92

Proof. The tori I = const can be defined as the closures of the phase trajectories corresponding to the independent $\omega$. □

We note incidentally that, for the majority of values I, the frequencies $\omega$ will be independent.

Problem. Show that the set of I for which the frequencies $\omega(I)$ in a nondegenerate system are dependent has Lebesgue measure equal to zero.

Hint. Show first that

$$\operatorname{mes}\,\{\omega : \exists k \neq 0,\ (\omega, k) = 0\} = 0.$$

On the other hand, in degenerate systems we can construct systems of action-angle variables such that the tori I = const will be different in different systems. This is the case because the closures of trajectories in a degenerate system are tori of dimension k < n, and they can be contained in different ways in n-dimensional tori.

92 For example, we can always write the substitution $I' = I$, $\varphi' = \varphi + S_I(I)$, or $I_1, I_2;\ \varphi_1, \varphi_2 \to I_1 + I_2, I_2;\ \varphi_1, \varphi_2 - \varphi_1$.

Example 1. The planar harmonic oscillator $\ddot x = -x$; n = 2, k = 1. Separation of variables in cartesian and polar coordinates leads to different action-angle variables and different tori.

Example 2. Keplerian planar motion ($U = -1/r$), n = 2, k = 1. Here, too, separation of variables in polar and in elliptical coordinates leads to different I.

52  Averaging  of  perturbations 

Here  we  show  the  adiabatic  invariance  of  the  action  variable  in  a  system  with  one  degree  of 
freedom. 

A  Systems  close  to  integrable  ones 

We  have  considered  a  great  many  integrable  systems  (one-dimensional 
problems,  the  two-body  problem,  small  oscillations,  the  Euler  and  Lagrange 
cases  of  the  motion  of  a  rigid  body  with  a  fixed  point,  etc.).  We  studied  the 
characteristics  of  phase  trajectories  in  these  systems:  they  turned  out  to  be 
“windings  of  tori,”  densely  filling  up  the  invariant  tori  in  phase  space;  every 
trajectory  is  uniformly  distributed  on  this  torus. 

One  should  not  conclude  from  this  that  integrability  is  the  typical 
situation.  Actually,  the  properties  of  trajectories  in  many-dimensional 
systems  can  be  highly  diverse  and  not  at  all  similar  to  the  properties  of 
conditionally-periodic  motions.  In  particular,  the  closure  of  a  trajectory 
of  a  system  with  n  degrees  of  freedom  can  fill  up  complicated  sets  of  dimension 
greater than n in 2n-dimensional phase space; a trajectory could even be
dense and uniformly distributed on a whole (2n − 1)-dimensional manifold
given by the equation H = h.93 One may call such systems “nonintegrable”
since  they  do  not  admit  single-valued  first  integrals  independent  of  H. 
The  study  of  such  systems  is  still  far  from  complete;  it  constitutes  a  problem 
in  “ergodic  theory.” 

One approach to nonintegrable systems is to study systems which are close to integrable ones. For example, the problem of the motion of planets around the sun is close to the integrable problem of the motion of noninteracting points around a stationary center; other examples are the problem of the motion of a slightly nonsymmetric heavy top and the problem of nonlinear oscillations close to an equilibrium position (the nearby integrable problem is linear). The following method is especially fruitful in the investigation of these and similar problems.

B  The  averaging  principle 

Let $I, \varphi$ be action-angle variables in an integrable (“nonperturbed”) system with hamiltonian function $H_0(I)$:

$$\dot I = 0, \qquad \dot\varphi = \omega(I), \qquad \omega(I) = \frac{\partial H_0}{\partial I}.$$

93  For  example,  inertial  motion  on  a  manifold  of  negative  curvature  has  this  property. 




10:  Introduction  to  perturbation  theory 


As the nearby “perturbed” system we take the system

(1) $\dot\varphi = \omega(I) + \varepsilon f(I, \varphi), \qquad \dot I = \varepsilon g(I, \varphi),$

where $\varepsilon \ll 1$.

We will ignore for a while that the system is hamiltonian and consider an arbitrary system of differential equations in the form (1) given on the direct product $T^k \times G$ of the k-dimensional torus $T^k = \{\varphi = (\varphi_1, \dots, \varphi_k) \bmod 2\pi\}$ and a region G in l-dimensional space, $G \subset \mathbb{R}^l = \{I = (I_1, \dots, I_l)\}$. For $\varepsilon = 0$ the motion in (1) is conditionally-periodic with at most k frequencies and with k-dimensional invariant tori.

The averaging principle for system (1) consists of its replacement by another system, called the averaged system:

(2) $\dot J = \varepsilon \bar g(J), \qquad \bar g(J) = (2\pi)^{-k} \int_0^{2\pi} \cdots \int_0^{2\pi} g(J, \varphi)\,d\varphi_1 \cdots d\varphi_k$

in the l-dimensional region $G \subset \mathbb{R}^l = \{J = (J_1, \dots, J_l)\}$.

We  claim  that  system  (2)  is  a  “good  approximation”  to  system  (1). 

We  note  that  this  principle  is  neither  a  theorem,  an  axiom,  nor  a  definition, 
but  rather  a  physical  proposition,  i.e.,  a  vaguely  formulated  and,  strictly 
speaking,  untrue  assertion.  Such  assertions  are  often  fruitful  sources  of 
mathematical  theorems. 

This  averaging  principle  may  be  found  explicitly  in  the  work  of  Gauss 
(in  studying  the  perturbations  of  planets  on  one  another,  Gauss  proposed 
to  distribute  the  mass  of  each  planet  around  its  orbit  proportionally  to  time 
and  to  replace  the  attraction  of  each  planet  by  the  attraction  of  the  ring  so 
obtained).  Nevertheless,  a  satisfactory  description  of  the  connection  between 
the  solutions  of  systems  (1)  and  (2)  in  the  general  case  has  not  yet  been  found. 

In replacing system (1) by system (2) we discard the term $\varepsilon \tilde g(I, \varphi) = \varepsilon g(I, \varphi) - \varepsilon \bar g(I)$ on the right-hand side. This term has order $\varepsilon$ as does the remaining term $\varepsilon \bar g$. In order to understand the different roles of the terms $\bar g$ and $\tilde g$ in g, we consider the simplest example.

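Problem. Consider the case k = l = 1,

$$\dot\varphi = \omega \neq 0, \qquad \dot I = \varepsilon g(\varphi).$$

Show that for $0 < t < 1/\varepsilon$,

$$|I(t) - J(t)| < c\varepsilon, \qquad \text{where } J(t) = I(0) + \varepsilon \bar g t.$$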


Solution.

$$I(t) - I(0) = \int_0^t \varepsilon g(\varphi_0 + \omega t)\,dt = \int_0^t \varepsilon \bar g\,dt + \frac{\varepsilon}{\omega}\int_0^{\omega t} \tilde g(\varphi)\,d\varphi = \varepsilon \bar g t + \frac{\varepsilon}{\omega}\,h(\omega t),$$

where $h(\varphi) = \int_0^{\varphi} \tilde g(\varphi)\,d\varphi$ is a periodic, and therefore bounded, function.

Figure 224 Evolution and oscillation

Thus the variation in I with time consists of two parts: an oscillation of order $\varepsilon$ depending on $\tilde g$ and a systematic “evolution” with velocity $\varepsilon \bar g$ (Figure 224).
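The split into evolution and oscillation can be seen in a short numerical experiment. This is a sketch under assumptions that are not in the original: the perturbation $g(\varphi) = 1 + \cos\varphi$, the values $\varepsilon = 0.01$ and $\omega = 2$, and the initial conditions $I(0) = 0$, $\varphi_0 = 0$ are chosen only for illustration. The code integrates $\dot I = \varepsilon g(\omega t)$ and records the largest distance from the averaged solution $J(t) = \varepsilon \bar g t$ over $0 < t < 1/\varepsilon$.

```python
import math


def g(phi: float) -> float:
    """Illustrative 2*pi-periodic perturbation; its mean value is 1."""
    return 1.0 + math.cos(phi)


def mean_over_period(func, m: int = 1000) -> float:
    """Average of a 2*pi-periodic function over one period (uniform grid)."""
    h = 2.0 * math.pi / m
    return sum(func(j * h) for j in range(m)) / m


def max_deviation(eps: float = 0.01, omega: float = 2.0, dt: float = 1e-3) -> float:
    """Largest |I(t) - J(t)| on 0 < t <= 1/eps, with I(0) = J(0) = 0 and phi_0 = 0."""
    g_bar = mean_over_period(g)
    n = int(round(1.0 / (eps * dt)))
    action = 0.0
    worst = 0.0
    for step in range(n):
        t_mid = (step + 0.5) * dt
        action += eps * g(omega * t_mid) * dt   # midpoint rule for dI/dt = eps*g(omega*t)
        averaged = eps * g_bar * (step + 1) * dt
        worst = max(worst, abs(action - averaged))
    return worst


if __name__ == "__main__":
    eps, omega = 0.01, 2.0
    print("max |I - J|:", max_deviation(eps, omega), " eps/omega:", eps / omega)
```

For this choice of g the oscillating part is $(\varepsilon/\omega)\sin\omega t$, so the deviation stays of order $\varepsilon$ while I itself changes by an amount of order 1 over the time $1/\varepsilon$.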

The  averaging  principle  is  based  on  the  assertion  that  in  the  general 
case  the  motion  of  system  (1)  can  be  divided  into  the  “evolution”  (2)  and 
small  oscillations.  In  its  general  form,  this  assertion  is  invalid  and  the  principle 
itself  is  untrue.  Nevertheless,  we  will  apply  the  principle  to  the  hamiltonian 
system  (1): 

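$$\dot\varphi = \frac{\partial}{\partial I}\bigl(H_0(I) + \varepsilon H_1(I, \varphi)\bigr), \qquad \dot I = -\frac{\partial}{\partial \varphi}\bigl(H_0(I) + \varepsilon H_1(I, \varphi)\bigr).$$

For the right-hand side of the averaged system (2) we then obtain

$$\bar g = -(2\pi)^{-n} \int_0^{2\pi} \frac{\partial}{\partial \varphi} H_1(I, \varphi)\,d\varphi = 0.$$

In other words, there is no evolution in a nondegenerate hamiltonian system.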

One  variant  of  this  entirely  nonrigorous  deduction  leads  to  the  so- 
called  Laplace  theorem:  The  semi-major  axes  of  the  keplerian  ellipses  of 
the  planets  have  no  secular  perturbations. 

The discussion above suffices to convince us of the importance of the averaging principle; we now formulate a theorem justifying this principle in one very particular case, that of single-frequency oscillations (k = 1). This theorem shows that the averaging principle correctly describes evolution over a large interval of time ($0 < t < 1/\varepsilon$).

C  Averaging  in  a  single-frequency  system 
Consider the system of l + 1 differential equations

(1) $\dot\varphi = \omega(I) + \varepsilon f(I, \varphi), \quad \varphi \bmod 2\pi \in S^1; \qquad \dot I = \varepsilon g(I, \varphi), \quad I \in G \subset \mathbb{R}^l,$

where $f(I, \varphi + 2\pi) = f(I, \varphi)$ and $g(I, \varphi + 2\pi) = g(I, \varphi)$, together with the “averaged” system of l equations

(2) $\dot J = \varepsilon \bar g(J), \qquad \text{where } \bar g(J) = \frac{1}{2\pi}\int_0^{2\pi} g(J, \varphi)\,d\varphi.$

Figure 225 Theorem on averaging

We denote by $I(t), \varphi(t)$ the solution of system (1) with initial conditions $I(0), \varphi(0)$, and by $J(t)$ the solution of system (2) with the same initial condition $J(0) = I(0)$ (Figure 225).

Theorem. Suppose that:

1. the functions $\omega$, f, and g are defined for I in a bounded region G, and in this region they are bounded, together with their derivatives up to second order:

$$\|\omega, f, g\|_{C^2(G \times S^1)} < c_1;$$

2. in the region G, we have

$$\omega(I) > c > 0;$$

3. for $0 < t < 1/\varepsilon$, a neighborhood of radius d of the point $J(t)$ belongs to G:

$$J(t) \in G - d.$$

Then for sufficiently small $\varepsilon$ ($0 < \varepsilon < \varepsilon_0$)

$$|I(t) - J(t)| < c_9 \varepsilon \qquad \text{for all } t,\ 0 < t < \frac{1}{\varepsilon},$$

where the constant $c_9 > 0$ depends on $c_1$, c, and d, but not on $\varepsilon$.

Some applications of this theorem will be given below (“adiabatic invariants”). We remark that the basic idea of the proof of this theorem (a change of variables diminishing the perturbation) is more important than the theorem itself; this is one of the basic ideas in the theory of ordinary differential equations; it is encountered in elementary courses as the “method of variation of constants.”
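The estimate of the theorem can be illustrated on a small example. Everything specific in the sketch below is an assumption made for illustration and does not come from the original: the frequency $\omega(I) = 1 + I$, the perturbation $g(I, \varphi) = -I(1 + \cos\varphi)$ with $f = 0$, the initial data, and the Runge-Kutta step. For this choice the averaged system is $\dot J = -\varepsilon J$, with solution $J(t) = J(0)e^{-\varepsilon t}$.

```python
import math


def rhs(state, eps):
    """Right-hand side of dI/dt = -eps*I*(1 + cos(phi)), dphi/dt = 1 + I."""
    action, phi = state
    return (-eps * action * (1.0 + math.cos(phi)), 1.0 + action)


def rk4_step(state, eps, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, eps)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), eps)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), eps)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), eps)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def max_error(eps, action0=1.0, phi0=0.0, dt=0.01):
    """Largest |I(t) - J(t)| on 0 < t <= 1/eps, where J(t) = I(0)*exp(-eps*t)."""
    n = int(round(1.0 / (eps * dt)))
    state = (action0, phi0)
    worst = 0.0
    for step in range(1, n + 1):
        state = rk4_step(state, eps, dt)
        averaged = action0 * math.exp(-eps * step * dt)
        worst = max(worst, abs(state[0] - averaged))
    return worst


if __name__ == "__main__":
    for eps in (0.04, 0.02, 0.01):
        err = max_error(eps)
        print(f"eps = {eps:5.2f}   max|I - J| = {err:.5f}   ratio to eps = {err / eps:.3f}")
```

If the theorem applies, the printed ratio should stay bounded as $\varepsilon$ decreases, even though the time interval $1/\varepsilon$ grows.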

D  Proof  of  the  theorem  on  averaging 

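In place of the variables I we will introduce new variables P

(3) $P = I + \varepsilon k(I, \varphi),$

where the function k, $2\pi$-periodic in $\varphi$, will be chosen so that the vector P will satisfy a simpler differential equation.

By (1) and (3), the rate of change of $P(t)$ is

(4) $\dot P = \dot I + \varepsilon \frac{\partial k}{\partial I}\dot I + \varepsilon \frac{\partial k}{\partial \varphi}\dot\varphi = \varepsilon\left[g(I, \varphi) + \frac{\partial k}{\partial \varphi}\,\omega(I)\right] + \varepsilon^2 \frac{\partial k}{\partial I}\,g + \varepsilon^2 \frac{\partial k}{\partial \varphi}\,f.$

We assume that the substitution (3) can be inverted, so that

(5) $I = P + \varepsilon h(P, \varphi, \varepsilon)$

(where the functions h are $2\pi$-periodic in $\varphi$).

Then (4) and (5) imply that $P(t)$ satisfies the system of equations

(6) $\dot P = \varepsilon\left[g(P, \varphi) + \frac{\partial k}{\partial \varphi}\,\omega(P)\right] + R,$

where the “remainder term” R is small of second order with respect to $\varepsilon$:

(7) $|R| < c_2 \varepsilon^2, \qquad c_2(c_1, c_3, c_4) > 0,$

if only

(8) $\|\omega\|_{C^2} < c_1, \quad \|f\|_{C^2} < c_1, \quad \|g\|_{C^2} < c_1, \quad \|k\|_{C^2} < c_3, \quad \|h\|_{C^2} < c_4.$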

We will now try to choose the change of variables (3) so that the term involving $\varepsilon$ in (6) becomes zero. For k we get the equation

$$\frac{\partial k}{\partial \varphi} = -\frac{g}{\omega}.$$

In general, such an equation is not solvable in the class of functions k periodic in $\varphi$. In fact, the average value (with respect to $\varphi$) of the left-hand side is always equal to 0, and the average value of the right-hand side can be different from 0. Therefore, we cannot choose k in such a way as to kill the entire term involving $\varepsilon$ in (6). However, we can kill the entire “periodic” part of g,

$$\tilde g(P, \varphi) = g(P, \varphi) - \bar g(P),$$

by setting

(9) $k(P, \varphi) = -\int_0^{\varphi} \frac{\tilde g(P, \varphi)}{\omega(P)}\,d\varphi.$
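A short check that this choice of k does what is wanted may help; the computation below is a sketch added for illustration and uses only the definitions of $\tilde g$, $\bar g$, and k given in the text.

```latex
\begin{align*}
\frac{\partial k}{\partial \varphi}\,\omega(P) &= -\tilde g(P, \varphi)
  \quad\Longrightarrow\quad
  g(P, \varphi) + \frac{\partial k}{\partial \varphi}\,\omega(P)
  = g(P, \varphi) - \tilde g(P, \varphi) = \bar g(P), \\
k(P, \varphi + 2\pi) - k(P, \varphi)
  &= -\frac{1}{\omega(P)} \int_0^{2\pi} \tilde g(P, \varphi)\,d\varphi = 0 .
\end{align*}
```

The first line shows that the bracket in (6) reduces to the average $\bar g(P)$; the second shows that k is $2\pi$-periodic in $\varphi$ precisely because $\tilde g$ has zero mean.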


So we define the function k by formula (9). Then, by hypotheses 1 and 2 of the theorem, the function k satisfies the estimate $\|k\|_{C^2} < c_3$, where $c_3(c_1, c) > 0$. In order to establish the inequality (8), we must estimate h. For this we must first show that the substitution (3) is invertible.

Fix a positive number $\alpha$.


Lemma. If $\varepsilon$ is sufficiently small, then the restriction of the mapping (3)94

$$I \to I + \varepsilon k, \qquad \text{where } |k|_{C^2(G)} < c_3,$$

to the region $G - \alpha$ (consisting of points whose $\alpha$-neighborhood is contained in G) is a diffeomorphism. The inverse diffeomorphism (5) in the region $G - 2\alpha$ satisfies the estimate $\|h\|_{C^2} < c_4$ with some constant $c_4(\alpha, c_3) > 0$.

94 For any fixed value of the parameter $\varphi$.

Proof. The necessary estimate follows directly from the implicit function theorem. The only difficulty is in verifying that the map $I \to I + \varepsilon k$ is one-to-one in the region $G - \alpha$. We note that the function k satisfies a Lipschitz condition (with some constant $L(\alpha, c_3)$) in $G - \alpha$. Consider two points $I_1, I_2$ in $G - \alpha$. For sufficiently small $\varepsilon$ (namely, for $L\varepsilon < 1$) the distance between $\varepsilon k(I_1)$ and $\varepsilon k(I_2)$ will be smaller than $|I_1 - I_2|$. Therefore, $I_1 + \varepsilon k(I_1) \neq I_2 + \varepsilon k(I_2)$. Thus the map (3) is one-to-one on $G - \alpha$, and the lemma is proved. □

It follows from the lemma that for $\varepsilon$ small enough all the estimates (8) are satisfied. Thus the estimate (7) is also true.

We now compare the system of differential equations for J

(2) $\dot J = \varepsilon \bar g(J)$

and for P; the latter, in view of (9), takes the form

(6') $\dot P = \varepsilon \bar g(P) + R.$

Since the difference between the right sides is of order $\lesssim \varepsilon^2$ (cf. (7)), for time $t < 1/\varepsilon$ the difference $|P - J|$ between the solutions is of order $\varepsilon$ (Figure 226). On the other hand, $|I - P| = \varepsilon|k| \lesssim \varepsilon$. Thus, for $t < 1/\varepsilon$, the difference $|I - J|$ is of order $\lesssim \varepsilon$, as was to be proved. □


Figure  226  Proof  of  the  theorem  on  averaging 


To find an accurate estimate, we introduce the quantity

(10) $z(t) = P(t) - J(t).$

Then (6') and (9) imply

$$\dot z = \varepsilon\bigl(\bar g(P) - \bar g(J)\bigr) + R = \varepsilon \frac{\partial \bar g}{\partial P}\,z + R',$$

where $|R'| < c_2 \varepsilon^2 + c_5 \varepsilon |z|$ if the segment (P, J) lies in $G - \alpha$. Under this assumption we find

(11) $|\dot z| < c_6 \varepsilon |z| + c_2 \varepsilon^2$ (where $c_6 = c_5 + c_1$), $\qquad |z(0)| < c_3 \varepsilon.$

Lemma. If $|\dot z| \le a|z| + b$ and $|z(0)| < d$ for $a, b, d, t > 0$, then $|z(t)| \le (d + bt)e^{at}$.

Proof. $|z(t)|$ is no greater than the solution $y(t)$ of the equation $\dot y = ay + b$, $y(0) = d$. Solving this equation, we find $y = Ce^{at}$, $\dot C e^{at} = b$, $\dot C = e^{-at}b$, $C(0) = d$, $C \le d + bt$. □
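The variation-of-constants step in this proof is compressed; written out, under the same notation, it reads as follows (a sketch added for clarity, not part of the original).

```latex
\begin{align*}
y(t) &= C(t)\,e^{at}, \qquad
\dot y = \dot C\,e^{at} + a\,C\,e^{at} = a y + b
\;\Longrightarrow\; \dot C = b\,e^{-at} \le b \quad (t \ge 0), \\
C(t) &\le C(0) + bt = d + bt
\;\Longrightarrow\; |z(t)| \le y(t) \le (d + bt)\,e^{at} .
\end{align*}
```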

Now from (11) and the assumption that the segment (P, J) lies in $G - \alpha$ (Figure 226), we have

$$|z(t)| < (c_3 \varepsilon + c_2 \varepsilon^2 t)\,e^{c_6 \varepsilon t}.$$

From this it follows that, for $0 < t < 1/\varepsilon$,

$$|z(t)| < c_7 \varepsilon, \qquad c_7 = (c_3 + c_2)\,e^{c_6}.$$

We see that, if $\alpha = d/3$ and $\varepsilon$ is small enough, the entire segment $(P(t), J(t))$ ($t < 1/\varepsilon$) lies inside $G - \alpha$ and, therefore,

$$|P(t) - J(t)| < c_8 \varepsilon \qquad \text{for all } 0 < t < \frac{1}{\varepsilon}.$$

On the other hand, $|P(t) - I(t)| < |\varepsilon k| < c_3 \varepsilon$. Thus, for all t with $0 < t < 1/\varepsilon$,

$$|I(t) - J(t)| < c_9 \varepsilon, \qquad c_9 = c_8 + c_3 > 0,$$

and the theorem is proved. □


E  Adiabatic  invariants 


Consider a hamiltonian system with one degree of freedom, with hamiltonian function $H(p, q; \lambda)$ depending on a parameter $\lambda$. As an example, we can take a pendulum:

$$H = \frac{p^2}{2l^2} + lg\,\frac{q^2}{2};$$

as the parameter $\lambda$ we can take the length l or the acceleration of gravity g. Suppose that the parameter changes slowly with time. It turns out that in the limit as the rate of change of the parameter approaches 0, there is a remarkable asymptotic phenomenon: two quantities, generally independent, become functions of one another.

Assume,  for  example,  that  the  length  of  the  pendulum  changes  slowly 
(in  comparison  with  its  characteristic  oscillations).  Then  the  amplitude 
of  its  oscillation  becomes  a  function  of  the  length  of  the  pendulum.  If  we 
very  slowly  increase  by  a  factor  of  two  the  length  of  the  pendulum  and  then 
very  slowly  decrease  it  to  the  original  value,  then  at  the  end  of  this  process 
the  amplitude  of  the  oscillation  will  be  the  same  as  it  was  at  the  start. 

Furthermore, it turns out that the ratio of the energy H of the pendulum to the frequency $\omega$ changes very little under a slow change of the parameter, although the energy and frequency themselves may change a lot. Quantities such as this ratio, which change little under slow changes of parameter, are called by physicists adiabatic invariants.

Figure 227 Adiabatic change in the length of a pendulum

It is easy to see that the adiabatic invariance of the ratio of the energy of a pendulum to its frequency is an assertion of a physical character, i.e., it is untrue without further assumptions. In fact, if we vary the length of a pendulum arbitrarily slowly, but choose the phase of oscillation under which the length increases and decreases, we can set the pendulum swinging (parametric resonance). In view of this, physicists have suggested formulating the definition of adiabatic invariance as follows: the person changing the parameters of the system must not see what state the system is in (Figure 227).
Giving  this  definition  a  rigorous  mathematical  meaning  is  a  very  delicate 
and  as  yet  unsolved  problem.  Fortunately,  we  can  get  along  with  a  surrogate. 
The  assumption  of  ignorance  of  the  internal  state  of  the  system  on  the  part 
of  the  person  controlling  the  parameter  may  be  replaced  by  the  requirement 
that  the  change  of  parameter  must  be  smooth,  i.e.,  twice  continuously 
differentiable. 

More precisely, let $H(p, q; \lambda)$ be a fixed, twice continuously differentiable function of $\lambda$. Set $\lambda = \varepsilon t$ and consider the resulting system with slowly varying parameter $\lambda = \varepsilon t$:

(*) $\dot p = -\frac{\partial H}{\partial q}, \qquad \dot q = \frac{\partial H}{\partial p}, \qquad H = H(p, q; \varepsilon t).$

Definition. The quantity $I(p, q; \lambda)$ is an adiabatic invariant of the system (*) if for every $\kappa > 0$ there is an $\varepsilon_0 > 0$ such that if $0 < \varepsilon < \varepsilon_0$ and $0 < t < 1/\varepsilon$, then

$$|I(p(t), q(t); \varepsilon t) - I(p(0), q(0); 0)| < \kappa.$$

Clearly, every first integral is also an adiabatic invariant. It turns out that every one-dimensional system (*) has an adiabatic invariant. Namely, the adiabatic invariant is the action variable in the corresponding problem with constant coefficients.

Assume that the phase trajectories of the system with hamiltonian $H(p, q; \lambda)$ are closed. We define a function $I(p, q; \lambda)$ in the following way. For fixed $\lambda$ there is a phase portrait corresponding to the hamiltonian function $H(p, q; \lambda)$ (Figure 228). Consider the closed phase trajectory passing through a point (p, q). It bounds some region in the phase plane. We denote the area of this region by $2\pi I(p, q; \lambda)$. I = const on every phase trajectory (for given $\lambda$). Clearly, I is nothing but the action variable (cf. Section 50).

Theorem. If the frequency $\omega(I, \lambda)$ of the system (*) is nowhere zero, then $I(p, q; \lambda)$ is an adiabatic invariant.

Figure 228 Adiabatic invariant of a one-dimensional system

F Proof of the adiabatic invariance of action

For fixed $\lambda$ we can introduce action-angle variables $I, \varphi$ into the system (*) by a canonical transformation depending on $\lambda$: $p, q \to I, \varphi$; $\dot\varphi = \omega(I, \lambda)$, $\dot I = 0$; $\omega(I, \lambda) = \partial H_0/\partial I$, $H_0 = H_0(I, \lambda)$.

We denote by $S(I, q; \lambda)$ the (multiple-valued) generating function of this transformation:

$$p = \frac{\partial S}{\partial q}, \qquad \varphi = \frac{\partial S}{\partial I}.$$

Now let $\lambda = \varepsilon t$. Since the change from variables p, q to variables $I, \varphi$ is now performed by a time dependent canonical transformation, the equations of motion in the new variables $I, \varphi$ have the hamiltonian form, but with hamiltonian function (cf. Section 45A)

$$K = H_0 + \frac{\partial S}{\partial t} = H_0 + \varepsilon \frac{\partial S}{\partial \lambda}.$$

Problem. Show that $\partial S(I, q; \lambda)/\partial \lambda$ is a single-valued function on the phase plane.
Hint. S is determined up to the addition of multiples of $2\pi I$.


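In this way we obtain the equations of motion in the form

$$\dot\varphi = \omega(I, \lambda) + \varepsilon f(I, \varphi; \lambda), \qquad f = \frac{\partial^2 S}{\partial I\,\partial \lambda},$$

$$\dot I = \varepsilon g(I, \varphi; \lambda), \qquad g = -\frac{\partial^2 S}{\partial \varphi\,\partial \lambda},$$

$$\dot\lambda = \varepsilon.$$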

Since $\omega \neq 0$, the averaging theorem (Section 52C) is applicable. The averaged system has the form

$$\dot J = \varepsilon \bar g, \qquad \dot\Lambda = \varepsilon.$$

But $g = -(\partial/\partial\varphi)(\partial S/\partial\lambda)$, and $\partial S/\partial\lambda$ is a single-valued function on the circle I = const. Therefore, $\bar g = (2\pi)^{-1} \int g\,d\varphi = 0$, and in the averaged system J does not change at all: $J(t) = J(0)$.

By the averaging theorem, $|I(t) - I(0)| < c\varepsilon$ for all t with $0 < t < 1/\varepsilon$, as was to be proved. □


Example. For a harmonic oscillator (cf. Figure 217),

$$H = \frac{a^2}{2}p^2 + \frac{b^2}{2}q^2, \qquad I = \frac{1}{2\pi}\,\pi\,\frac{\sqrt{2h}}{a}\,\frac{\sqrt{2h}}{b} = \frac{h}{\omega}, \qquad \omega = ab,$$

i.e., the ratio of energy to frequency is an adiabatic invariant.

Figure 229 Adiabatic invariant of an absolutely elastic ball between slowly changing walls

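Problem. The length of a pendulum is slowly doubled ($l = l_0(1 + \varepsilon t)$, $0 < t < 1/\varepsilon$). How does the amplitude $q_{\max}$ of the oscillations vary?

Solution. $I = \tfrac{1}{2}\,l^{3/2} g^{1/2} q_{\max}^2$; therefore,

$$q_{\max}(t) = q_{\max}(0)\left(\frac{l(0)}{l(t)}\right)^{3/4}.$$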
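The pendulum problem lends itself to a numerical check. The sketch below is an illustration, not part of the original: it assumes the linearized pendulum hamiltonian $H = p^2/(2l^2) + lg\,q^2/2$ with unit mass, the values g = 1, $l_0 = 1$, $\varepsilon = 0.005$, and a fourth-order Runge-Kutta integrator. It reports the action $I = H/\omega$, the energy, and the amplitude $q_{\max} = \sqrt{2H/(lg)}$ before and after the length is doubled.

```python
import math


def rhs(t, p, q, eps, l0, g):
    """Hamilton's equations for H = p^2/(2 l^2) + l*g*q^2/2 with l = l0*(1 + eps*t)."""
    length = l0 * (1.0 + eps * t)
    return -length * g * q, p / (length * length)   # (dp/dt, dq/dt)


def summary(t, p, q, eps, l0, g):
    """Length, energy, action H/omega and amplitude at the instant t."""
    length = l0 * (1.0 + eps * t)
    energy = p * p / (2.0 * length ** 2) + length * g * q * q / 2.0
    omega = math.sqrt(g / length)
    return {"l": length, "H": energy, "I": energy / omega,
            "q_max": math.sqrt(2.0 * energy / (length * g))}


def simulate(eps=0.005, l0=1.0, g=1.0, q0=0.1, dt=0.005):
    """Integrate over 0 <= t <= 1/eps with RK4; return summaries at start and end."""
    n = int(round(1.0 / (eps * dt)))
    t, p, q = 0.0, 0.0, q0
    start = summary(t, p, q, eps, l0, g)
    for _ in range(n):
        a1, b1 = rhs(t, p, q, eps, l0, g)
        a2, b2 = rhs(t + dt / 2, p + dt / 2 * a1, q + dt / 2 * b1, eps, l0, g)
        a3, b3 = rhs(t + dt / 2, p + dt / 2 * a2, q + dt / 2 * b2, eps, l0, g)
        a4, b4 = rhs(t + dt, p + dt * a3, q + dt * b3, eps, l0, g)
        p += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        q += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
        t += dt
    return start, summary(t, p, q, eps, l0, g)


if __name__ == "__main__":
    start, end = simulate()
    print("action ratio    I(T)/I(0):", end["I"] / start["I"])
    print("energy ratio    H(T)/H(0):", end["H"] / start["H"])
    print("amplitude ratio          :", end["q_max"] / start["q_max"],
          " vs (l0/l)^(3/4) =", (start["l"] / end["l"]) ** 0.75)
```

The energy changes by a factor close to $1/\sqrt{2}$ while the action stays nearly constant, which is the content of the theorem for this example.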

As a second example, consider the motion of a perfectly elastic rigid ball of mass 1 between perfectly elastic walls whose separation l slowly varies (Figure 229). We may consider that a point is moving in an “infinitely deep rectangular potential well,” and that the phase trajectories are rectangles of area 2vl, where v is the velocity of the ball. In this case the product vl of the velocity of the ball and the distance between the walls turns out to be an adiabatic invariant.95 Thus if we make the walls twice as close together, the velocity of the ball doubles, and if we separate the walls, the velocity decreases.
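The ball between walls can be simulated collision by collision. The following sketch rests on assumptions that are not in the original: the left wall is fixed at x = 0, the right wall moves inward with a constant small speed u, and the quantities are recorded each time the ball returns to the fixed wall.

```python
def squeeze(v0: float = 1.0, l0: float = 1.0, u: float = 1e-3):
    """Ball bouncing elastically between a fixed wall (x = 0) and a wall at x = l
    that approaches with constant speed u. Returns the list of (l, v) pairs
    recorded whenever the ball is back at the fixed wall, until l <= l0/2."""
    separation, speed = l0, v0
    history = [(separation, speed)]
    while separation > 0.5 * l0:
        t_hit = separation / (speed + u)   # ball meets the approaching wall
        x_hit = speed * t_hit
        speed_back = speed + 2.0 * u       # elastic reflection from a wall moving at speed u
        t_back = x_hit / speed_back        # return trip to the fixed wall
        separation -= u * (t_hit + t_back)
        speed = speed_back
        history.append((separation, speed))
    return history


if __name__ == "__main__":
    history = squeeze()
    l_end, v_end = history[-1]
    print("round trips:", len(history) - 1)
    print("final l =", l_end, " final v =", v_end, " product v*l =", l_end * v_end)
```

In this particular model the product vl, sampled at the fixed wall, stays constant up to rounding, so the speed roughly doubles by the time the separation has been halved.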

95 This does not formally follow from the theorem, since the theorem concerns smooth systems without shocks. The proof of the adiabatic invariance of vl in this system is an instructive elementary problem.

From a sheet of paper, one can form a cone or a cylinder, but it is impossible to obtain a piece of a sphere without folding, stretching, or cutting. The reason lies in the difference between the “intrinsic geometries” of these surfaces: no part of the sphere can be isometrically mapped onto the plane.

The invariant which distinguishes riemannian metrics is called riemannian curvature. The riemannian curvature of a plane is zero, and the curvature of a sphere of radius R is equal to $R^{-2}$. If one riemannian manifold can be isometrically mapped to another, then the riemannian curvature at corresponding points is the same. For example, since a cone or cylinder is locally isometric to the plane, the riemannian curvature of the cone or cylinder at any point is equal to zero. Therefore, no region of a cone or cylinder can be mapped isometrically to a sphere.

The riemannian curvature of a manifold has a very important influence on the behavior of geodesics on it, i.e., on motion in the corresponding dynamical system. If the riemannian curvature of a manifold is positive (as on a sphere or ellipsoid), then nearby geodesics oscillate about one another in most cases, and if the curvature is negative (as on the surface of a hyperboloid of one sheet), geodesics rapidly diverge from one another.

In this appendix we define riemannian curvature and briefly discuss the properties of geodesics on manifolds of negative curvature. A further treatment of riemannian curvature can be found in the book, “Morse Theory” by John Milnor, Princeton University Press, 1963, and a treatment of geodesics on manifolds of negative curvature in D. V. Anosov’s book, “Geodesic flows on closed riemannian manifolds with negative curvature,” Proceedings of the Steklov Institute of Mathematics, No. 90 (1967), Am. Math. Soc., 1969.

A  Parallel  translation  on  surfaces 

The  definition  of  riemannian  curvature  is  based  on  the  construction  of  parallel 
translation  of  vectors  along  curves  on  a  riemannian  manifold. 

We  begin  with  the  case  when  the  given  riemannian  manifold  is  two- 
dimensional,  i.e.,  a  surface,  and  the  given  curve  is  a  geodesic  on  this  surface. 
[See  Carmo,  Manfredo  Perdigao  do.  Differential  Geometry  of  Curves  and 
Surfaces,  Prentice-Hall,  1976.  (Translator’s  note)] 

Parallel  translation  of  a  vector  tangent  to  the  surface  along  a  geodesic  on 
this  surface  is  defined  as  follows:  the  point  of  origin  of  the  vector  moves  along 
the  geodesic,  and  the  vector  itself  moves  continuously  so  that  its  angle  with 
the  geodesic  and  its  length  remain  constant.  By  translating  to  the  endpoint 
of  the  geodesic  all  vectors  tangent  to  the  surface  at  the  initial  point,  we  obtain 
a  map  from  the  tangent  plane  at  the  initial  point  to  the  tangent  plane  at  the 
endpoint.  This  map  is  linear  and  isometric. 

We now define parallel translation of a vector on a surface along a broken line consisting of several geodesic arcs (Figure 230). In order to translate a vector along a broken line, we translate it from the first vertex to the second along the first geodesic arc, then translate this vector along the second arc to the next vertex, etc.

Figure 230 Parallel translation along a broken geodesic

Problem.  Given  a  vector  tangent  to  the  sphere  at  one  vertex  of  a  spherical  triangle  with  three 
right  angles,  translate  this  vector  around  the  triangle  and  back  to  the  same  vertex. 


Answer.  As  a  result  of  this  translation  the  tangent  plane  to  the  sphere  at  the  initial  vertex  will 
be  turned  by  a  right  angle. 

Finally,  parallel  translation  of  a  vector  along  any  smooth  curve  on  a  surface 
is  defined  by  a  limiting  procedure,  in  which  the  curve  is  approximated  by 
broken  lines  consisting  of  geodesic  arcs. 


Problem. Translate a vector directed towards the North Pole and located at Leningrad (latitude
$\lambda = 60^\circ$) around the 60th parallel and back to Leningrad, moving to the east.

Answer. The vector turns through the angle $2\pi(1 - \sin\lambda)$, i.e., approximately 50° to the west.
Thus the size of the angle of rotation is proportional to the area bounded by our parallel, and
the direction of rotation coincides with the direction the origin of the vector is going around the
North Pole.

Hint.  It  is  sufficient  to  translate  the  vector  along  the  same  circle  on  the  cone  formed  by  the 
tangent lines to the meridian, going through all the points of the parallel (Figure 231). This cone
then  can  be  unrolled  onto  the  plane,  after  which  parallel  translation  on  its  surface  becomes 
ordinary  parallel  translation  on  the  plane. 
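The two sphere problems above can be checked numerically. The following is a minimal sketch, not the author's method: it assumes a unit sphere, uses the fact that parallel translation along a great-circle arc from p to q is the rigid rotation about the axis p × q that carries p to q, and approximates the 60th parallel by a broken line of 2000 geodesic arcs, as in the limiting procedure described above. All function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transport_along_arc(v, p, q):
    """Parallel translate the tangent vector v from p to q along the shorter
    great-circle arc of the unit sphere (p and q distinct, not antipodal).
    The translation is the rotation about p x q taking p to q (Rodrigues)."""
    axis = cross(p, q)
    s = math.sqrt(dot(axis, axis))   # sine of the arc length
    c = dot(p, q)                    # cosine of the arc length
    k = tuple(x / s for x in axis)   # unit rotation axis
    kxv = cross(k, v)
    kv = dot(k, v)
    return tuple(v[i] * c + kxv[i] * s + k[i] * kv * (1.0 - c) for i in range(3))

def transport_around(v, vertices):
    """Translate v around the closed broken geodesic through `vertices`,
    returning to vertices[0]."""
    pts = list(vertices) + [vertices[0]]
    for p, q in zip(pts, pts[1:]):
        v = transport_along_arc(v, p, q)
    return v

# Spherical triangle with three right angles: one octant of the sphere.
e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
v_tri = transport_around(e2, [e1, e2, e3])   # e2 is tangent to the sphere at e1
triangle_turn = math.acos(max(-1.0, min(1.0, dot(e2, v_tri))))

# 60th parallel, traversed eastward, approximated by geodesic arcs.
lam = math.radians(60.0)
n_arcs = 2000
parallel = [(math.cos(lam) * math.cos(2 * math.pi * k / n_arcs),
             math.cos(lam) * math.sin(2 * math.pi * k / n_arcs),
             math.sin(lam)) for k in range(n_arcs)]
north = (-math.sin(lam), 0.0, math.cos(lam))   # unit vector towards the pole
east = (0.0, 1.0, 0.0)                         # unit vector along the parallel
v_par = transport_around(north, parallel)
# Angle measured from north towards west.
westward_turn = math.atan2(-dot(v_par, east), dot(v_par, north))

print(math.degrees(triangle_turn))
print(math.degrees(westward_turn),
      math.degrees(2 * math.pi * (1 - math.sin(lam))))
```

The second printed pair compares the computed turn with $2\pi(1 - \sin 60^\circ) \approx 0.842$ radians, i.e. about 48.2°, which is the "approximately 50°" of the answer.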


Figure 231 Parallel translation on the sphere

Example. We consider the upper half-plane $y > 0$ of the plane of complex numbers $z = x + iy$
with the metric

$$ds^2 = \frac{dx^2 + dy^2}{y^2}.$$

It  is  easy  to  compute  that  the  geodesics  of  this  two-dimensional  riemannian  manifold  are  circles 
and  straight  lines  perpendicular  to  the  x-axis.  Linear  fractional  transformations  with  real 
coefficients 

$$z \mapsto \frac{az + b}{cz + d}$$

are  isometric  transformations  of  our  manifold,  which  is  called  the  Lobachevsky  plane. 

Problem. Translate a vector directed along the imaginary axis at the point $z = i$ to the point
$z = t + i$ along the horizontal line ($dy = 0$) (Figure 232).


Answer.  Under  translation  by  t  the  vector  turns  t  radians  in  the  direction  from  the  y-axis  towards 
the  x-axis. 
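This answer can be reproduced by integrating the parallel translation equation $dV^k/ds = -\Gamma^k_{ij}\,\dot{x}^i V^j$ along the line $y = 1$. The sketch below is illustrative only: the Christoffel symbols are those of the metric $ds^2 = (dx^2 + dy^2)/y^2$ (nonzero ones: $\Gamma^x_{xy} = \Gamma^x_{yx} = -1/y$, $\Gamma^y_{xx} = 1/y$, $\Gamma^y_{yy} = -1/y$), and the Runge-Kutta integrator and function names are assumptions, not taken from the text.

```python
import math

def transport_rhs(y, dx, dy, v):
    """Right-hand side of dV/ds = -Gamma(xdot, V) on the upper half-plane
    with metric (dx^2 + dy^2)/y^2; (dx, dy) is the velocity of the path."""
    vx, vy = v
    return ((dx * vy + dy * vx) / y,
            (-dx * vx + dy * vy) / y)

def transport_horizontal(v, t, steps=1000):
    """Translate v along the line x = s, y = 1, 0 <= s <= t (classical RK4)."""
    h = t / steps
    f = lambda w: transport_rhs(1.0, 1.0, 0.0, w)   # y = 1, velocity (1, 0)
    for _ in range(steps):
        k1 = f(v)
        k2 = f((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = f((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = f((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

t = 1.0
vx, vy = transport_horizontal((0.0, 1.0), t)   # start along the imaginary axis
turn = math.atan2(vx, vy)                       # angle from the y-axis towards the x-axis
print(turn, t)
```

Along $y = 1$ the equations reduce to $\dot{V}^x = V^y$, $\dot{V}^y = -V^x$, so the vector rotates uniformly from the y-axis towards the x-axis, and the computed turn agrees with t.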


Figure  232  Parallel  translation  on  the  Lobachevsky  plane 
B  The  curvature  form 

We will now define the riemannian curvature at each point of a two-dimensional
riemannian manifold (i.e., a surface). For this purpose, we choose an
orientation of our surface in a neighborhood of the point under consideration
and consider parallel translation of vectors along the boundary of a small
region D on our surface. It is easy to calculate that the result of such a
translation is rotation by a small angle. We denote this angle by $\varphi(D)$ (the sign of the
angle is fixed by the choice of orientation of the surface).

If we divide the region D into two parts $D_1$ and $D_2$, the result of parallel
translation along the boundary of D can be obtained by first going around
one part, and then the other. Thus,

$$\varphi(D) = \varphi(D_1) + \varphi(D_2),$$

i.e., the angle $\varphi$ is an additive function of regions. When we change the direction
of travel along the boundary, the angle $\varphi$ changes sign. It is natural
therefore to represent $\varphi(D)$ as the integral over D of a suitable 2-form. Such
a 2-form in fact exists; it is called the curvature form, and we denote it by $\Omega$.
Thus we define the curvature form $\Omega$ by the relation

$$\varphi(D) = \int_D \Omega. \tag{1}$$


The value of $\Omega$ on a pair of tangent vectors $\xi$, $\eta$ in $TM_x$ can be defined in the
following way. We identify a neighborhood of the point 0 in the tangent space
to M at x with a neighborhood of the point x on M (using, for example,
some local coordinate system). We can then construct on M the parallelogram
$\Pi_\varepsilon$ spanned by the vectors $\varepsilon\xi$, $\varepsilon\eta$, at least for sufficiently small $\varepsilon$.

Now the value of the curvature form on our vectors is defined by the
formula

$$\Omega(\xi, \eta) = \lim_{\varepsilon \to 0} \frac{\varphi(\Pi_\varepsilon)}{\varepsilon^2}. \tag{2}$$


In  other  words,  the  value  of  the  curvature  form  on  a  pair  of  tangent  vectors 
is  equal  to  the  angle  of  rotation  under  translation  along  the  infinitely  small 
parallelogram  determined  by  these  vectors. 


Problem.  Find  the  curvature  forms  on  the  plane,  on  a  sphere  of  radius  R,  and  on  the  Lobachevsky 
plane. 

Answer. $\Omega = 0$, $\Omega = R^{-2}\,dS$, $\Omega = -dS$, where the 2-form $dS$ is the area element on our
oriented  surface. 
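As a consistency check (a sketch using only the answers stated above), formula (1) with $\Omega = R^{-2}\,dS$ reproduces the rotation angle found earlier for the 60th parallel. Here D is assumed to be the polar cap north of latitude $\lambda$ on a sphere of radius R, whose area is $2\pi R^2(1 - \sin\lambda)$:

```latex
\varphi(D) = \int_D \Omega
           = \frac{1}{R^2} \int_D dS
           = \frac{1}{R^2} \cdot 2\pi R^2 (1 - \sin\lambda)
           = 2\pi (1 - \sin\lambda).
% Taking D to be the whole sphere (area 4\pi R^2) gives
\int_{S^2} \Omega = \frac{1}{R^2} \cdot 4\pi R^2 = 4\pi .
```

The second line is the simplest instance of the statement about convex surfaces in the problem that follows.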

Problem.  Show  that  the  function  defined  by  formula  (2)  is  really  a  differential  2-form,  independent 
of  the  arbitrary  choice  involved  in  the  construction,  and  that  the  rotation  of  a  vector  under 
translation  along  the  boundary  of  a  finite  oriented  region  D  is  expressed,  in  terms  of  this  form, 
by  formula  (1). 

Problem. Show that the integral of the curvature form over any convex surface in three-dimensional
euclidean space is equal to $4\pi$.

C  The  riemannian  curvature  of  a  surface 

We note that every differential 2-form on a two-dimensional oriented
riemannian manifold M can be written in the form $\rho\,dS$, where dS is the
oriented area element and $\rho$ is a scalar function uniquely determined by the
choice of metric and orientation.

In particular, the curvature form can be written in the form

$$\Omega = K\,dS,$$

where $K : M \to \mathbb{R}$ is a smooth function on M and dS is the area element.

The  value  of  the  function  K  at  a  point  x  is  called  the  riemannian  curvature 
of  the  surface  at  x. 

Problem. Calculate the riemannian curvature of the euclidean space, the sphere of radius R,
and the Lobachevsky plane.

Answer. $K = 0$, $K = R^{-2}$, $K = -1$.

Problem. Show that the riemannian curvature does not depend on the orientation of the manifold,
but only on its metric.

Hint. The 2-forms $\Omega$ and $dS$ both change sign under a change of orientation.

Problem.  Show  that,  for  surfaces  in  ordinary  three-dimensional  euclidean  space,  the  riemannian 
curvature  at  every  point  is  equal  to  the  product  of  the  inverses  of  the  principal  radii  of  curvature 
(with  minus  sign  if  the  centers  of  curvature  lie  on  opposite  sides  of  the  surface). 


We  note  that  the  sign  of  a  manifold’s  curvature  at  a  point  does  not  depend 
on  the  orientation  of  the  manifold;  this  sign  may  be  defined  without  using  the 
orientation  at  all. 

Namely,  on  manifolds  of  positive  curvature,  a  vector  parallel  translated 
around  the  boundary  of  a  small  region  turns  around  its  origin  in  the  same 
direction  as  the  point  on  the  boundary  goes  around  the  region;  on  manifolds 
of  negative  curvature  the  direction  of  rotation  is  opposite. 

We note further that the value of the curvature at a point is determined
by the metric in a neighborhood of this point, and therefore is preserved
under bending: the curvature is the same at corresponding points of isometric
surfaces. Hence, riemannian curvature is also called intrinsic curvature.

The formulas for computing curvature in terms of components of the
metric in some coordinate system involve the second derivatives of the metric
and are rather complicated: cf. the problems in Section G below.
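For metrics of the special conformal form $ds^2 = e^{2u}(dx^2 + dy^2)$ the general formulas collapse to a single expression. This is a standard result quoted here without derivation; it is not taken from this text. As a sketch, applying it to the Lobachevsky plane with metric $ds^2 = (dx^2 + dy^2)/y^2$ recovers the value $K = -1$ given in the answer above:

```latex
ds^2 = e^{2u}\,(dx^2 + dy^2), \qquad
K = -e^{-2u}\,\Delta u, \qquad
\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}.
% Lobachevsky plane: e^{2u} = y^{-2}, so u = -\ln y
u = -\ln y, \qquad
\Delta u = \frac{1}{y^2}, \qquad
K = -y^2 \cdot \frac{1}{y^2} = -1 .
```

Only second derivatives of the metric coefficient enter, in line with the remark above.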

D  Higher-dimensional  parallel  translation 

The construction of parallel translation on riemannian manifolds of dimension
greater than two is somewhat more complicated than the two-dimensional
construction presented above. The reason is that in these
dimensions the direction of the vector being translated is no longer determined
by the condition that the angle with a geodesic be invariant. In fact, the vector
could rotate around the direction of the geodesic while preserving its angle
with the geodesic.

The  refinement  which  we  must  introduce  into  the  construction  of  parallel 
translation  along  a  geodesic  is  the  choice  of  a  two-dimensional  plane  passing 
through  the  tangent  to  the  geodesic,  which  must  contain  the  translated  vector. 
This  choice  is  made  in  the  following  (unfortunately  complicated)  way. 

At the initial point of a geodesic the needed plane is the plane spanned by
the vector to be translated and the direction vector of the geodesic. We look
at all geodesics proceeding from the initial point, in directions lying in this
plane. The set of all such geodesics (close to the initial point) forms a smooth
surface which contains the geodesic along which we intend to translate the
vector (Figure 233).

Consider a new point on the geodesic at a small distance $\Delta$ from the initial
point. The tangent plane at the new point to the surface described above
contains the direction of the geodesic at this new point. We take this new
point as the initial point and use its tangent plane to construct a new surface
(formed by the bundle of geodesics emanating from the new point). This
surface contains the original geodesic. We move along the original geodesic
again by $\Delta$ and repeat the construction from the beginning.

Figure 233 Parallel translation in space

After a finite number of steps we can reach any point of the original geodesic.
As a result of our work we have, at every point of the geodesic, a tangent
plane containing the direction of the geodesic. This plane depends on the
length $\Delta$ of the steps in our construction. As $\Delta \to 0$ the family of tangent
planes obtained converges (as can be calculated) to a definite limit. As a
result we have a field of two-dimensional tangent planes along our geodesic
containing the direction of the geodesic and determined in an intrinsic
manner by the metric on the manifold.

Now  parallel  translation  of  our  vector  along  a  geodesic  is  defined  as  in  the 
two-dimensional  case:  under  translation  the  vector  must  remain  in  the  planes 
described  above;  its  length  and  its  angle  with  the  direction  of  the  geodesic 
must  be  preserved.  Parallel  translation  along  any  curve  is  defined  using 
approximations  by  geodesic  polygons,  as  in  the  two-dimensional  case. 

Problem.  Show  that  parallel  translation  of  vectors  from  one  point  of  a  riemannian  manifold 
to  another  along  a  fixed  path  is  a  linear  isometric  operator  from  the  tangent  space  at  the  first 
point  to  the  tangent  space  at  the  second  point. 

Problem. Parallel translate any vector along the line

$$x_1 = t, \quad x_2 = 0, \quad y = 1 \qquad (0 \le t \le \tau)$$

in a Lobachevsky space with metric

$$ds^2 = \frac{dx_1^2 + dx_2^2 + dy^2}{y^2}.$$

Answer. Vectors in the directions of the $x_1$ and $y$ axes are rotated by angle $\tau$ in the plane spanned
by them (rotation is in the direction from the y-axis towards the $x_1$-axis); vectors in the $x_2$-direction
are carried parallel to themselves in the sense of the euclidean metric.

E  The  curvature  tensor 

We now consider, as in the two-dimensional case, parallel translation along
small closed paths beginning and ending at a point of a riemannian manifold.
Parallel translation along such a path returns vectors to the original tangent
space. The map of the tangent space to itself thus obtained is a small rotation
(an orthogonal transformation close to the identity).

In the two-dimensional case we characterized this rotation by one number, the angle of rotation
$\varphi$. In higher dimensions a skew-symmetric operator plays the role of $\varphi$. Namely, any orthogonal
operator A which is close to the identity can be written in a natural way in the form

$$A = e^{\Phi} = E + \Phi + \frac{\Phi^2}{2} + \cdots,$$

where $\Phi$ is a small skew-symmetric operator.

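Problem. Compute $\Phi$ if A is a rotation of the plane through a small angle $\varphi$.

Answer.

$$A = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}, \qquad
\Phi = \begin{pmatrix} 0 & \varphi \\ -\varphi & 0 \end{pmatrix}.$$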
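The relation $A = e^{\Phi}$ for a plane rotation can be verified by summing the exponential series directly. The sketch below is illustrative: it takes the skew-symmetric matrix with entries $\pm\varphi$ as the candidate for $\Phi$, truncates the series $E + \Phi + \Phi^2/2 + \cdots$ after 30 terms, and compares the result with the rotation matrix. The helper names are assumptions.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(op, terms=30):
    """Truncated series E + op + op^2/2! + ... for a small square matrix."""
    n = len(op)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term op^m / m!
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, op)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

phi = 0.1
Phi = [[0.0, phi], [-phi, 0.0]]                # skew-symmetric candidate
A = mat_exp(Phi)
rotation = [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]
max_diff = max(abs(A[i][j] - rotation[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The agreement reflects the identity $\Phi^2 = -\varphi^2 E$, which splits the series into the cosine and sine series.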


Unlike in the two-dimensional case, the function $\Phi$ is not generally additive (since the
orthogonal group of n-space for n > 2 is not commutative). Nevertheless, we can construct a
curvature form using $\Phi$, describing the “infinitely small rotation caused by parallel translation
around an infinitely small parallelogram” in the same way as in the two-dimensional case, i.e.,
using formula (2).


Thus, let $\xi$ and $\eta$ in $TM_x$ be vectors tangent to the riemannian manifold
M at the point x. Construct a small curvilinear parallelogram $\Pi_\varepsilon$ on M (the
sides of the parallelogram $\Pi_\varepsilon$ are obtained from the vectors $\varepsilon\xi$ and $\varepsilon\eta$ by a
coordinate identification of a neighborhood of zero in $TM_x$ with a neighborhood
of x in M). We will look at parallel translation along the sides of the
parallelogram $\Pi_\varepsilon$ (we begin the circuit at $\xi$).

The result of translation will be an orthogonal transformation of $TM_x$,
close to the identity. It differs from the identity transformation by a quantity
of order $\varepsilon^2$ and has the form

$$A_\varepsilon(\xi, \eta) = E + \varepsilon^2 \Omega + o(\varepsilon^2),$$

where $\Omega$ is a skew-symmetric operator depending on $\xi$ and $\eta$. Therefore, we
can define a function $\Omega$ of pairs of vectors $\xi$, $\eta$ in the tangent space at x with
values in the space of skew-symmetric operators on $TM_x$ by the formula

$$\Omega(\xi, \eta) = \lim_{\varepsilon \to 0} \frac{A_\varepsilon(\xi, \eta) - E}{\varepsilon^2}.$$


Problem. Show that the function $\Omega$ is a differential 2-form (with values in the skew-symmetric
operators on $TM_x$) and does not depend on the choice of coordinates we used to identify $TM_x$
and M.

The form $\Omega$ is called the curvature tensor of the riemannian manifold.
We could say that the curvature tensor describes the infinitesimal rotation
in the tangent space obtained by parallel translation around an infinitely
small parallelogram.

F Curvature in a two-dimensional direction

Consider  a  two-dimensional  subspace  L  in  the  tangent  space  to  a  riemannian 
manifold  at  some  point.  We  take  geodesics  emanating  from  this  point  in 
all  the  directions  in  L.  These  geodesics  form  a  smooth  surface  close  to  our 
point.  The  surface  constructed  lies  in  the  riemannian  manifold  and  has  an 
induced  riemannian  metric. 

By  the  curvature  of  a  riemannian  manifold  M  in  the  direction  of  a  2-plane 
L  in  the  tangent  space  to  M  at  a  point  x,  we  mean  the  riemannian  curvature  at 
x  of  the  surface  described  above. 

Problem.  Find  the  curvatures  of  a  three-dimensional  sphere  of  radius  R  and  of  Lobachevsky 
space  in  all  possible  two-dimensional  directions. 

Answer. $R^{-2}$, $-1$.

In  general,  the  curvatures  of  a  riemannian  manifold  in  different  two- 
dimensional  directions  are  different.  Their  dependence  on  the  direction  is 
described  by  formula  (3)  below. 

Theorem. The curvature of a riemannian manifold in the two-dimensional
direction determined by a pair of orthogonal vectors $\xi$, $\eta$ of length 1 can be
expressed in terms of the curvature tensor $\Omega$ by the formula

$$K = \langle \Omega(\xi, \eta)\xi, \eta \rangle, \tag{3}$$

where the brackets denote the scalar product giving the riemannian metric.


The  proof  is  obtained  by  comparing  the  definitions  of  the  curvature  tensor  and  of  curvature 
in  a  two-dimensional  direction.  We  will  not  go  into  it  in  a  rigorous  way.  It  is  possible  to  take 
formula  (3)  for  the  definition  of  the  curvature  K. 

G  Covariant  differentiation 

Connected  with  parallel  translation  along  curves  in  a  riemannian  manifold 
is  a  particular  differential  calculus— so-called  covariant  differentiation,  or 
the  riemannian  connection.  We  define  this  differentiation  in  the  following 
way. 

Let $\xi$ be a vector tangent to a riemannian manifold M at a point x, and v
a vector field given on M in a neighborhood of x. The covariant derivative
of the field v in the direction $\xi$ is defined by using any curve passing through x
with velocity $\xi$. After moving along this curve for a small interval of time t,
we find ourselves at a new point x(t). We take the vector field v at this point
x(t) and parallel translate it backwards along the curve to the original point
x. We obtain a vector depending on t in the tangent space to M at x. For
t = 0 this vector is v(x), and for other t it changes according to the
non-parallelness of the vector field v along our curve in the direction $\xi$.

Consider the derivative of the resulting vector with respect to t, evaluated
at t = 0. This derivative is a vector in the tangent space $TM_x$. It is called the
covariant derivative of the field v along $\xi$ and is denoted by $\nabla_\xi v$. It is easy to
verify that the vector $\nabla_\xi v$ does not depend on the choice of curve specified in
the definition, but only on $\xi$ and v.


Problem 1. Prove the following properties of covariant differentiation:

1. $\nabla_\xi v$ is a bilinear function of $\xi$ and $v$.

2. $\nabla_\xi (fv) = (L_\xi f)v + f(x)\nabla_\xi v$, where $f$ is a smooth function and $L_\xi f$ is the derivative of $f$ in the
direction of the vector $\xi$ in $TM_x$.

3. $L_\xi \langle v, w \rangle = \langle \nabla_\xi v, w(x) \rangle + \langle v(x), \nabla_\xi w \rangle$.

4. $\nabla_{v(x)} w - \nabla_{w(x)} v = [w, v](x)$ (where $L_{[w, v]} = L_v L_w - L_w L_v$).

Problem 2. Show that the curvature tensor can be expressed in terms of covariant differentiation
in the following way:

$$\Omega(\xi_0, \eta_0)\zeta_0 = -\nabla_\xi \nabla_\eta \zeta + \nabla_\eta \nabla_\xi \zeta + \nabla_{[\eta, \xi]} \zeta,$$

where $\xi$, $\eta$, $\zeta$ are any vector fields whose values at the point under consideration are $\xi_0$, $\eta_0$, and $\zeta_0$.

Problem 3. Show that the curvature tensor satisfies the following identities:

$$\Omega(\xi, \eta)\zeta + \Omega(\eta, \zeta)\xi + \Omega(\zeta, \xi)\eta = 0,$$
$$\langle \Omega(\xi, \eta)\alpha, \beta \rangle = \langle \Omega(\alpha, \beta)\xi, \eta \rangle.$$

Problem 4. Suppose that the riemannian metric is given in local coordinates $x_1, \ldots, x_n$ by the
symmetric matrix $g_{ij}$:

$$ds^2 = \sum g_{ij}\,dx_i\,dx_j.$$

Denote by $e_1, \ldots, e_n$ the coordinate vector fields (so that differentiation in the direction $e_i$ is
$\partial_i = \partial/\partial x_i$). Then covariant derivatives can be calculated using the formulas in Problem 1 and
the following formulas:

$$\nabla_{e_i} e_j = \sum_k \Gamma_{ij}^k e_k, \qquad
\Gamma_{ij}^k = \sum_l \tfrac{1}{2}\left(\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij}\right) g^{lk},$$

where $(g^{lk})$ is the inverse matrix to $(g_{kl})$.

By using the expression for the curvature tensor in terms of the connection in Problem 2,
we also obtain an explicit formula for the curvature. The numbers $R_{ijkl} = \langle \Omega(e_i, e_j)e_k, e_l \rangle$ are
called the components of the curvature tensor.
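The formula for the coefficients $\Gamma_{ij}^k$ in Problem 4 is easy to evaluate numerically. The following sketch is not part of the text: it handles only two-dimensional metrics, replaces the partial derivatives $\partial_l g_{ij}$ by central differences with an assumed step h, and is applied to the Lobachevsky plane metric $ds^2 = (dx^2 + dy^2)/y^2$ with coordinates indexed 0 (x) and 1 (y). The function names are illustrative.

```python
def christoffel(metric, x, h=1e-5):
    """Return gamma[k][i][j] = sum_l 1/2 (d_i g_jl + d_j g_il - d_l g_ij) g^{lk}
    for a 2x2 metric given as a function of the coordinate list x."""
    def d_metric(l):
        # Central difference of the metric matrix in the coordinate x_l.
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
                for i in range(2)]

    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    dg = [d_metric(0), d_metric(1)]            # dg[l][i][j] = d_l g_ij
    return [[[sum(0.5 * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j]) * g_inv[l][k]
                  for l in range(2))
              for j in range(2)]
             for i in range(2)]
            for k in range(2)]

def half_plane_metric(x):
    y = x[1]
    return [[1.0 / y**2, 0.0], [0.0, 1.0 / y**2]]

gamma = christoffel(half_plane_metric, [0.3, 2.0])
print(gamma[0][0][1], gamma[1][0][0], gamma[1][1][1])
```

At the point with y = 2 the exact values are $\Gamma^x_{xy} = -1/y = -0.5$, $\Gamma^y_{xx} = 1/y = 0.5$ and $\Gamma^y_{yy} = -1/y = -0.5$; the printed numbers should agree with these up to the finite-difference error.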

H  The  Jacobi  equation 

The riemannian curvature of a manifold is closely connected with the behavior
of its geodesics. In particular, let us consider a geodesic passing
through some point in some direction, and alter slightly the initial conditions,
i.e., the initial point and initial direction. The new initial conditions determine
a new geodesic. At first this geodesic differs very little from the original
geodesic. To investigate the divergence it is useful to linearize the differential
equation of geodesics close to the original geodesic. The second-order linear
differential equation thus obtained (“the variational equation” for the equation
of geodesics) is called the Jacobi equation; it is convenient to write
it in terms of covariant derivatives and curvature tensors.

We denote by $x(t)$ a point moving along a geodesic in the manifold M
with velocity (of constant magnitude) $v(t) \in TM_{x(t)}$. If the initial condition
depends smoothly on a parameter $\alpha$, then the geodesic also depends smoothly
on the parameter. Consider the motion corresponding to a value of $\alpha$. We
denote the position of a point at time t on the corresponding geodesic by
$x(t, \alpha) \in M$. We will assume that the initial geodesic corresponds to the zero
value of the parameter, so that $x(t, 0) = x(t)$.

The vector field of geodesic variation is the derivative of the function
$x(t, \alpha)$ with respect to $\alpha$, evaluated at $\alpha = 0$; the value of this field at the point
$x(t)$ is equal to

$$\left.\frac{d}{d\alpha}\right|_{\alpha = 0} x(t, \alpha) = \xi(t) \in TM_{x(t)}.$$

To write the variational equation, we define the covariant derivative with
respect to t of a vector field $\xi(t)$ given on the geodesic $x(t)$. To define this, we
take the vector $\xi(t + h)$, parallel translate it from the point $x(t + h)$ to
$x(t)$ along the geodesic, differentiate the vector obtained in the tangent space
$TM_{x(t)}$ with respect to h and evaluate at h = 0. The result is a vector in
$TM_{x(t)}$, which is called the covariant derivative of the field $\xi(t)$ with respect
to t, and denoted by $D\xi/Dt$.

Theorem. The vector field of geodesic variation satisfies the second-order linear
differential equation

$$\frac{D^2 \xi}{Dt^2} = -\Omega(v, \xi)\,v, \tag{4}$$

where $\Omega$ is the curvature tensor, and $v = v(t)$ is the velocity vector of motion
along the original geodesic.

Conversely,  every  solution  of  the  differential  equation  (4)  is  a  field  of 
variation  of  the  original  geodesic. 

Equation  (4)  is  called  the  Jacobi  equation. 

Problem.  Prove  the  theorem  above. 

Problem. Let M be a surface, $y(t)$ the magnitude of the component of the vector $\xi(t)$ in the direction
normal to a given geodesic, and let the length of the vector $v(t)$ be equal to 1. Show that y
satisfies the differential equation

$$\ddot{y} = -Ky, \tag{5}$$

where $K = K(t)$ is the riemannian curvature at the point $x(t)$.
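For constant curvature, Equation (5) can be integrated numerically and compared with its closed-form solutions. The sketch below is an illustration under stated assumptions: initial data $y(0) = 0$, $\dot{y}(0) = 1$ (geodesics leaving a common point), a sphere of radius R = 2 with $K = R^{-2}$, the Lobachevsky plane with $K = -1$, and a classical Runge-Kutta scheme; none of these choices come from the text.

```python
import math

def solve_jacobi(K, t_end, steps=2000):
    """Integrate y'' = -K*y with y(0) = 0, y'(0) = 1 for constant K (RK4).
    Returns y(t_end)."""
    h = t_end / steps
    y, p = 0.0, 1.0                      # p stands for dy/dt

    def f(y, p):
        return p, -K * y

    for _ in range(steps):
        k1y, k1p = f(y, p)
        k2y, k2p = f(y + 0.5 * h * k1y, p + 0.5 * h * k1p)
        k3y, k3p = f(y + 0.5 * h * k2y, p + 0.5 * h * k2p)
        k4y, k4p = f(y + h * k3y, p + h * k3p)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
    return y

R = 2.0
t_end = math.pi * R                      # half of a great circle of the sphere
y_sphere = solve_jacobi(1.0 / R**2, t_end)
y_lobachevsky = solve_jacobi(-1.0, t_end)

print(y_sphere, R * math.sin(t_end / R))       # closed form R*sin(t/R)
print(y_lobachevsky, math.sinh(t_end))         # closed form sinh(t)
```

On the sphere the normal deviation $R\sin(t/R)$ oscillates and returns to zero after the path length $\pi R$, while for $K = -1$ the deviation $\sinh t$ grows exponentially.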

Problem. Using Equation (5), compare the behavior of geodesics close to a given one on the
sphere ($K = +R^{-2}$) and on the Lobachevsky plane ($K = -1$).

I Investigation of the Jacobi equation

In investigating the variational equations, it is useful to disregard the trivial
variations, i.e., changes of the time origin and of the magnitude of the initial
velocity of motion. To this end we decompose the variation vector $\xi$ into
components parallel and perpendicular to the velocity vector v. Then (since
$\Omega(v, v) = 0$ and since the operator $\Omega(v, \xi)$ is skew-symmetric) for the normal
component we again get the Jacobi equation, and for the parallel component
we get the equation

$$\frac{D^2 \xi_\parallel}{Dt^2} = 0.$$


We now note that the Jacobi equation for the normal component can be
written in the form of “Newton’s equation”

$$\frac{D^2 \xi}{Dt^2} = -\operatorname{grad} U,$$

where the quadratic form U of the vector $\xi$ is expressed in terms of the curvature
tensor and is proportional to the curvature K in the direction of the
$(\xi, v)$ plane:

$$U(\xi) = \tfrac{1}{2}\langle \Omega(v, \xi)v, \xi \rangle = \tfrac{1}{2}K\langle \xi, \xi \rangle \langle v, v \rangle.$$

Thus the behavior of the normal component of the variation vector of a
geodesic with velocity 1 can be described by the equation of a (non-autonomous)
linear oscillator whose potential energy is equal to the product of the
curvature in the direction of the plane of velocity vectors and variations with
the square of the length of the normal component of the variation.

In particular we consider the case when the curvature is negative in all
two-dimensional directions containing the velocity vector of the geodesic
(Figure 234). Then the divergence of nearby geodesics from the given one in
the normal direction can be described by the equation of an oscillator with
negative definite (and time-dependent) potential energy. Therefore, the
normal component of divergence for nearby geodesics behaves like the divergence
of a ball, located near the top of a hill, from the top. The equilibrium
position of the ball at the top is unstable. This means that geodesics near the
given geodesic will diverge exponentially from it.

Figure 234 Nearby geodesics on manifolds of positive and negative curvature

If the potential energy of the newtonian equation we obtained did not depend on time, our
conclusion would be rigorous. Let us assume further that the curvature in the different directions
containing v is in the interval

$$-a^2 \le K \le -b^2, \quad \text{where } 0 < b < a.$$

Then solutions to the Jacobi equation for normal divergence will be linear combinations of
exponential curves with exponent $\pm\lambda_i t$, where the positive numbers $\lambda_i$ are between a and b.
Therefore, every solution to the Jacobi equation grows at least as fast as $e^{b|t|}$ as either
$t \to +\infty$ or $t \to -\infty$; most solutions grow even faster, with rate $e^{a|t|}$.

The  instability  of  an  equilibrium  position  under  negative  definite  potential 
energy  is  intuitively  obvious  also  in  the  non-autonomous  case.  It  can  be 
proven  by  comparison  with  a  corresponding  autonomous  system.  As  a 
result  of  such  a  comparison  we  may  convince  ourselves  that  under  motion 
along  a  geodesic,  all  solutions  of  the  Jacobi  equation  for  normal  divergence 
on  a  manifold  of  negative  curvature  grow  at  least  as  fast  as  an  exponential 
function  of  the  distance  traveled,  whose  exponent  is  equal  to  the  square 
root  of  the  absolute  value  of  the  curvature  in  the  two-dimensional  direction 
for which this absolute value is minimal. In fact, most solutions grow even
faster, but we cannot now assert that the exponent of growth for most solutions
is determined by the direction in which the absolute value of the negative
curvature is largest.

In  summary,  we  can  say  that  the  behavior  of  geodesics  on  a  manifold  of 
negative  curvature  is  characterized  by  exponential  instability.  For  numerical 
estimates  of  this  instability,  it  is  useful  to  define  the  characteristic  path  length 
s  as  the  average  path  length  on  which  small  errors  in  the  initial  conditions 
are  increased  e  times. 

More precisely, the characteristic path length s can be defined as the inverse
of the exponent $\lambda$ which characterizes the growth of the solution to the Jacobi
equation for normal divergence from the geodesic proceeding with velocity 1:

$$\lambda = \lim_{T \to \infty} \frac{1}{T} \max_{|t| < T} \max_{|\xi(0)| = 1} \ln|\xi(t)|, \qquad s = \frac{1}{\lambda}.$$

In general, the exponent $\lambda$ and the path s depend on the initial geodesic.

If the curvature of our manifold in all two-dimensional directions is
bounded away from zero by the number $-b^2$, then the characteristic path
length is less than or equal to $b^{-1}$. Thus as the curvature of a manifold gets
more negative, the characteristic path length s, on which the instability of
geodesics is reduced to e-fold growth of error, gets smaller. In view of the
exponential character of the growth of error, the course of a geodesic on a
manifold of negative curvature is practically impossible to predict.

Assume, for example, that the curvature is negative and bounded away
from zero by $-4\ \mathrm{m}^{-2}$. The characteristic path length is less than or equal to
half a meter, i.e., on a geodesic arc five meters long the error grows by approximately
$e^{10} \sim 10^4$. Therefore, an error of a tenth of a millimeter in the initial
conditions shows up in the form of a one-meter difference at the end of the
geodesic.
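The arithmetic of this estimate can be spelled out in a few lines. This is a sketch of the order-of-magnitude computation only; the variable names are illustrative, and the characteristic path length is taken at its upper bound $b^{-1}$.

```python
import math

b = 2.0                    # 1/m, from the curvature bound K <= -b^2 = -4 m^-2
s_max = 1.0 / b            # upper bound for the characteristic path length, in m
arc_length = 5.0           # length of the geodesic arc, in m
initial_error = 1e-4       # a tenth of a millimeter, in m

growth = math.exp(arc_length / s_max)      # e-fold growth over each length s
final_error = initial_error * growth       # resulting error, in m

print(s_max, growth, final_error)
```

The growth factor is $e^{10}$, a few times $10^4$, so the final error is of the order of meters, in agreement with the estimate above.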

J Geodesic flows on compact manifolds of negative curvature

Let  M  be  a  compact  riemannian  manifold  whose  curvature  at  every  point 
in  every  two-dimensional  direction  is  negative.  (Such  manifolds  exist.) 
Consider  the  inertial  motion  of  a  point  of  mass  1  on  M,  without  any  external 
forces.  The  lagrangian  function  of  this  system  is  equal  to  the  kinetic  energy, 
which  is  equal  to  the  total  energy  and  is  a  first  integral  of  the  equations  of 
motion. 

If M has dimension n, then each energy level manifold has dimension
$2n - 1$. This manifold is a submanifold of the tangent bundle of M. For
example, we can fix the value of the energy at $\tfrac{1}{2}$ (which corresponds to initial
velocity 1). Then the velocity vector of the point has length constantly equal
to 1, and our level manifold turns out to be the fiber bundle

$$T_1M \subset TM$$

consisting of the unit spheres in the tangent spaces to M at every point.

Thus, a point of the manifold $T_1M$ is represented as a vector of length
1 at a point of M. By the Maupertuis-Jacobi principle, we can describe the
motion of a point mass with fixed initial conditions in the following way:
the point moves with velocity 1 along the geodesic determined by the indicated
vector.

By the law of conservation of energy the manifold $T_1M$ is an invariant
manifold in the phase space of our system. Therefore, our phase flow determines
a one-parameter group of diffeomorphisms on the $(2n - 1)$-dimensional
manifold $T_1M$. This group is called the geodesic flow on M.
The geodesic flow can be described as follows: the transformation at time t
carries the unit vector $\xi \in T_1M$ located at the point x, to the unit velocity
vector of the geodesic coming from x in the direction $\xi$ located at the point
at distance t from x. We note that there is a naturally defined volume element
on $T_1M$ and that the geodesic flow preserves it (Liouville’s theorem).

Up to now we have not used the negative curvature of the manifold M.
But if we investigate the trajectories of the geodesic flow, it turns out that the
negative curvature of M has a strong impact on the behavior of these trajectories
(this is related to the exponential instability of geodesics on M).

Here are some properties of geodesic flows on manifolds of negative
curvature (for further details, see the book of D. V. Anosov cited earlier).

1.  Almost  all  phase  trajectories  are  dense  in  the  energy  level  manifold  (the 
exceptional  non-dense  trajectories  form  a  set  of  measure  zero). 

2. Uniform distribution: the amount of time which almost every trajectory
spends in any region of the phase space $T_1M$ is proportional to the volume
of the region.

3. The phase flow $g^t$ has the mixing property: if A and B are two regions, then

$$\lim_{t \to \infty} \operatorname{mes}[(g^t A) \cap B] = \operatorname{mes} A \cdot \operatorname{mes} B$$

(where mes denotes the volume, normalized by the condition that the
whole space have measure 1).

From  these  properties  of  trajectories  in  phase  space  follow  analogous 
statements  about  geodesics  on  the  manifold  itself.  Physicists  call  these 
properties  “stochastic”:  asymptotically  for  large  t  the  trajectories  behave  as 
if  the  point  were  random.  For  example,  the  mixing  property  means  that  the 
probability of turning up in B at a time t long after exiting from A is
proportional to the volume of B.
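
The mixing property can be observed numerically on a much simpler exponentially unstable system. The following sketch is not the geodesic flow itself: as a stand-in it uses the hyperbolic automorphism (x, y) -> (2x + y, x + y) mod 1 of the torus, which also preserves area and is mixing. The regions A and B, the number of iterations, the sample size, and all function names are arbitrary choices made for illustration.

```python
import random


def cat_map(x, y):
    """One step of the torus map (x, y) -> (2x + y, x + y) mod 1."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0


def in_box(point, box):
    """True if the point lies in the rectangle ((x0, x1), (y0, y1))."""
    (x0, x1), (y0, y1) = box
    return x0 <= point[0] < x1 and y0 <= point[1] < y1


def mixing_estimate(box_a, box_b, steps, samples=100_000, seed=0):
    """Monte Carlo estimate of mes[(g^steps A) n B] for the map g above.

    The map preserves area, so this measure equals the measure of the set
    of points that start in A and land in B after `steps` iterations.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        point = (rng.random(), rng.random())
        if not in_box(point, box_a):
            continue
        for _step in range(steps):
            point = cat_map(*point)
        if in_box(point, box_b):
            hits += 1
    return hits / samples


A = ((0.0, 0.5), (0.0, 0.5))  # mes A = 0.25
B = ((0.2, 0.7), (0.1, 0.5))  # mes B = 0.20

before = mixing_estimate(A, B, steps=0)   # plain overlap of A and B
after = mixing_estimate(A, B, steps=10)   # after ten iterations
print(before, after, 0.25 * 0.20)
```

With these choices the overlap of A and B has measure 0.12, while after ten iterations the estimate should lie close to mes A · mes B = 0.05, up to sampling error.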

Thus,  the  exponential  instability  of  geodesics  on  manifolds  of  negative 
curvature  leads  to  the  stochasticity  of  the  corresponding  geodesic  flow. 

K  Other  applications  of  exponential  instability 

The  exponential  instability  property  of  geodesics  on  manifolds  of  negative 
curvature  has  been  studied  by  many  authors,  beginning  with  Hadamard  (and, 
in  the  case  of  constant  curvature,  also  by  Lobachevsky),  but  especially  by 
E.  Hopf.  An  unexpected  discovery  of  the  1960s  in  this  area  was  the  surprising 
stability  of  exponentially  unstable  systems  with  respect  to  perturbations  of  the 
systems  themselves. 

Consider, for example, the vector field giving the geodesic flow on a compact
surface of negative curvature. As we showed above, the phase curves
of this flow are arranged in a complicated way: almost every one of them is
dense  in  the  three-dimensional  energy  level  manifold.  The  flow  has  infinitely 
many  closed  trajectories,  and  the  set  of  points  on  closed  trajectories  is  also 
dense  in  the  three-dimensional  energy  level  manifold. 

We  now  consider  a  nearby  vector  field.  It  turns  out  that,  in  spite  of  the 
complexity  of  the  picture  of  phase  curves,  the  entire  picture  with  dense 
phase  curves  and  infinitely  many  closed  trajectories  hardly  changes  at  all  if 
we  pass  to  the  nearby  field.  In  fact,  there  is  a  homeomorphism  close  to  the 
identity  transformation  which  takes  the  phase  curves  of  the  unperturbed 
flow  to  the  phase  curves  of  the  perturbed  flow. 

Thus  our  complicated  phase  flow  has  the  same  property  of  “structural 
stability” as a limit cycle, or a stable focus in the plane. We note that neither
a  center  in  the  plane  nor  a  winding  of  the  torus  has  this  property  of  structural 
stability:  the  topological  type  of  the  phase  portrait  in  these  cases  changes 
for  arbitrarily  small  changes  in  the  vector  field. 

The  existence  of  structurally  stable  systems  with  complicated  motions, 
each  of  which  is  in  itself  exponentially  unstable,  is  one  of  the  basic  discoveries 
of recent years in the theory of ordinary differential equations (the conjecture
that geodesic flows on manifolds of negative curvature are structurally
stable  was  made  by  S.  Smale  in  1961,  and  the  proof  was  given  by  D.  V. 
Anosov  and  published  in  1967;  the  basic  results  on  stochasticity  of  these 
flows  were  obtained  by  Ya.  G.  Sinai  and  D.  V.  Anosov,  also  in  the  1960s). 

Before  these  works  most  mathematicians  believed  that  in  systems  of 
differential  equations  in  “general  form”  only  the  simplest  stable  limiting 
behaviors  were  possible:  equilibrium  positions  and  cycles.  If  a  system  was 
more  complicated  (for  example,  if  it  was  conservative),  then  it  was  assumed 
that  after  a  small  change  in  its  equations  (for  example,  after  imposing  small 
non-conservative  perturbations)  complicated  motions  are  “dispersed”  into 
simple  ones.  We  now  know  that  this  is  not  so,  and  that  in  the  function  space 
of vector fields there are whole regions consisting of fields with more
complicated behavior of phase curves.

The  conclusions  which  follow  from  this  are  relevant  to  a  wide  range  of 
phenomena,  in  which  “stochastic”  behavior  of  deterministic  objects  is 
observed. 

Namely,  suppose  that  in  the  phase  space  of  some  (non-conservative) 
system  there  is  an  attracting  invariant  manifold  (or  set)  in  which  the  phase 
curves  have  the  property  of  exponential  instability.  We  now  know  that 
systems  with  such  a  property  are  not  exceptional:  under  small  changes  of  the 
system  this  property  must  persist.  What  is  seen  by  an  experimenter  observing 
motions  of  such  a  system? 

The  approach  of  phase  curves  to  an  attracting  set  will  be  interpreted  as 
the  establishment  of  some  sort  of  limiting  conditions.  The  further  motion  of  a 
phase  point  near  the  attracting  set  will  involve  chaotic,  unpredictable  changes 
of  “phase”  of  the  limiting  behavior,  perceptible  as  “stochasticity”  or 
“turbulence.” 

Unfortunately,  no  convincing  analysis  from  this  point  of  view  has  yet 
been  developed  for  physical  examples  of  a  turbulent  character.  A  primary 
example  is  the  hydrodynamic  instability  of  a  viscous  fluid,  described  by  the 
so-called  Navier-Stokes  equations.  The  phase  space  of  this  problem  is 
infinite-dimensional  (it  is  the  space  of  vector  fields  with  divergence  0  in  the 
domain  of  fluid  flow),  but  the  infinite-dimensionality  of  the  problem  is 
apparently  not  a  serious  obstacle,  since  the  viscosity  extinguishes  the  high 
harmonics  (small  vortices)  faster  and  faster  as  the  harmonics  are  higher  and 
higher.  As  a  result,  the  phase  curves  from  the  infinite-dimensional  space 
seem  to  approach  some  finite-dimensional  manifold  (or  set),  to  which  the 
limit regime also belongs.

For large viscosity, we have a stable attracting equilibrium position in the
phase space (“stable stationary flow”). As the viscosity decreases it loses
stability; for example, a stable limit cycle can appear in phase space (“periodic
flow”) or a stable equilibrium position of a new type (“secondary stationary
flow”).[96] As the viscosity decreases further, more and more harmonics come
into play, and the limit regime can become ever higher in dimension.

For  small  viscosity,  the  approach  to  a  limit  regime  with  exponentially 
unstable  trajectories  seems  very  likely.  Unfortunately,  the  corresponding 
calculations  have  not  yet  been  carried  out  due  to  the  limited  capacity  of 
existing  computers.  However,  the  following  general  conclusion  can  be  drawn 
without any calculations: turbulent phenomena may appear even if solutions
exist  and  are  unique;  exponential  instability,  which  is  encountered  even  in 
deterministic  systems  with  a  finite  number  of  degrees  of  freedom,  is  sufficient. 

As one more example of an application of exponential instability we mention
the proof announced by Ya. G. Sinai of the “ergodic hypothesis” of
Boltzmann  for  systems  of  rigid  balls.  The  hypothesis  is  that  the  phase  flow 
corresponding  to  the  motion  of  identical  absolutely  elastic  balls  in  a  box  with 
elastic  walls  is  ergodic  on  connected  energy  level  sets.  (Ergodicity  means  that 
almost  every  phase  curve  spends  an  amount  of  time  in  every  measurable 
piece  of  the  level  set  proportional  to  the  measure  of  that  piece.) 

Boltzmann’s  hypothesis  allows  us  to  replace  time  averages  by  space 
averages,  and  was  for  a  long  time  considered  to  be  necessary  to  justify 
statistical  mechanics.  In  reality,  Boltzmann’s  hypothesis  (in  which  it  is  a 
question  of  a  limit  as  time  approaches  infinity)  is  not  necessary  for  passing 
to  the  statistical  limit  (the  number  of  pieces  approaches  infinity).  However, 
Boltzmann’s hypothesis inspired the entire analysis of the stochastic properties
of dynamical systems (so-called ergodic theory), and its proof serves as a
measure  of  the  maturity  of  this  theory. 

The  exponential  instability  of  trajectories  in  Boltzmann’s  problem  arises 
as  a  result  of  collisions  of  the  balls  with  one  another,  and  can  be  explained 
in  the  following  way.  For  simplicity,  we  will  consider  a  system  of  only  two 
particles  in  the  plane,  and  will  represent  a  square  box  with  reflection  off  the 
walls by the planar torus {(x, y) mod 1}. Then we can consider one of the
particles as stationary (using the conservation of momentum); the other particle
can  be  considered  as  a  point. 

In  this  way  we  arrive  at  the  model  problem  of  motion  of  a  point  on  a  toral 
billiard table with a circular wall in the middle from which the point is
reflected according to the law “the angle of incidence is equal to the angle of
reflection”  (Figure  235). 

Figure 235 Torus-shaped billiard table with scattering by a circular wall

To investigate this system we look at an analogous billiard table bounded
on the outside by a planar convex curve (e.g., the motion of a point inside an
ellipse). Motion on such a billiard table can be considered as the limiting
case of the geodesic flow on the surface of an ellipsoid. Passage to the limit
consists of decreasing the smallest axis of the ellipsoid to zero. As a result,
geodesics on the ellipsoid become billiard trajectories on the ellipse. We
discover from this that the ellipse can reasonably be thought of as two-sided
and that, under every reflection, the geodesic goes from one side of the ellipse
to the other.

[96] A more detailed account of loss of stability is given in “Lectures on bifurcations and versal
families,” Russian Math. Surveys 27, no. 5 (1972), 55-123.

We  now  return  to  our  toral  billiard  table.  Motion  on  it  can  be  looked  at  as 
the  limiting  case  of  the  geodesic  flow  on  a  smooth  surface.  This  surface  is 
obtained  from  looking  at  the  torus  with  a  hole  as  a  two-sided  surface,  giving 
it  some  thickness  and  slightly  smoothing  the  sharp  edge.  As  a  result  we  have  a 
surface  with  the  topology  of  a  pretzel  (a  sphere  with  two  handles). 

After  blowing  up  the  ellipse  into  the  ellipsoid  we  obtain  a  surface  of 
positive  curvature;  after  blowing  up  the  torus  with  a  hole  we  get  a  surface  of 
negative  curvature  (in  both  cases  the  curvature  is  concentrated  close  to  the 
edge,  but  the  blowing  up  can  be  done  so  that  the  sign  of  the  curvature  does 
not  change).  Thus  motion  in  our  toral  billiard  table  can  be  looked  at  as  the 
limiting  case  of  motion  along  geodesics  on  a  surface  of  negative  curvature. 

Now, to prove Boltzmann’s conjecture (in the simple case under consideration)
it is sufficient to verify that the analysis of stochastic properties
of  geodesic  flows  on  surfaces  of  negative  curvature  holds  in  the  indicated 
limiting  case. 

A more detailed presentation of the proof turns out to be very complicated;
it  has  been  published  only  for  the  case  of  systems  of  two  particles  (Ya.  G. 
Sinai,  Dynamical  systems  with  elastic  reflections,  Russian  Mathematical 
Surveys, 25, no. 2 (1970), 137-189).

Eulerian motion of a rigid body can be described as motion along geodesics
in  the  group  of  rotations  of  three-dimensional  euclidean  space  provided  with 
a  left-invariant  riemannian  metric.  A  significant  part  of  Euler’s  theory 
depends  only  upon  this  invariance,  and  therefore  can  be  extended  to  other 
groups. 

Among  the  examples  involving  such  a  generalized  Euler  theory  are  motion 
of  a  rigid  body  in  a  high-dimensional  space  and,  especially  interesting,  the 
hydrodynamics  of  an  ideal  (incompressible  and  inviscid)  fluid.  In  the 
latter case, the relevant group is the group of volume-preserving
diffeomorphisms of the domain of fluid flow. In this example, the principle of least
action  implies  that  the  motion  of  the  fluid  is  described  by  the  geodesics  in  the 
metric  given  by  the  kinetic  energy.  (If  we  wish,  we  can  take  this  principle  to  be 
the  mathematical  definition  of  an  ideal  fluid.)  It  is  easy  to  verify  that  this 
metric  is  (right)  invariant. 

Of  course,  extending  results  obtained  for  finite-dimensional  Lie  groups 
to  the  infinite-dimensional  case  should  be  done  with  care.  For  example,  in 
three-dimensional  hydrodynamics  an  existence  and  uniqueness  theorem  for 
solutions  of  the  equations  of  motion  has  not  yet  been  proved.  Nevertheless, 
it  is  interesting  to  see  what  conclusions  can  be  drawn  by  formally  carrying 
over properties of geodesics on finite-dimensional Lie groups to the
infinite-dimensional case. These conclusions take the character of a priori statements
(identities,  inequalities,  etc.)  which  should  be  satisfied  by  all  reasonable 
solutions.  In  some  cases,  the  formal  conclusions  can  then  be  rigorously 
justified  directly,  without  infinite-dimensional  analysis. 

For  example,  the  Euler  equations  of  motion  for  a  rigid  body  have  as  their 
analogue  in  hydrodynamics  the  Euler  equations  of  motion  of  an  ideal  fluid. 
Euler’s  theorem  on  the  stability  of  rotations  around  the  large  and  small  axes 
of the inertia ellipsoid corresponds in hydrodynamics to a slight generalization
of Rayleigh’s theorem on the stability of flows without inflection points
of  the  velocity  profile. 

It  is  also  easy  to  extract  from  Euler’s  formulas  an  explicit  expression  for 
the  riemannian  curvature  of  a  group  with  a  one-sided  invariant  metric. 
Applying this to hydrodynamics we find the curvature of the group of
diffeomorphisms preserving the volume element. It is interesting to note that in
sufficiently  nice  two-dimensional  directions,  the  curvature  turns  out  to  be 
finite  and,  in  many  cases,  negative.  Negative  curvature  implies  exponential 
instability  of  geodesics  (cf.  Appendix  1).  In  the  case  under  consideration,  the 
geodesics  are  motions  of  an  ideal  fluid;  therefore  the  calculation  of  the 
curvature  of  the  group  of  diffeomorphisms  gives  us  some  information  on  the 
instability of ideal fluid flow. In fact, the curvature determines the characteristic
path length on which differences between initial conditions grow by e.
Negative  curvature  leads  to  practical  indeterminacy  of  the  flow:  on  a  path 
only  a  few  times  longer  than  the  characteristic  path  length,  a  deviation  in 
initial  conditions  grows  100  times  larger. 
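
The arithmetic behind “a few times longer” can be sketched as follows. The sketch assumes a model situation that the text does not spell out here: a constant negative curvature K in the relevant two-dimensional direction, so that a small deviation y along a geodesic obeys the Jacobi equation y'' = -K y and grows like exp(sqrt(-K) s). The value K = -1 and the function names are hypothetical.

```python
import math


def e_folding_length(curvature):
    """Characteristic path length 1/sqrt(-K) for constant curvature K < 0."""
    if curvature >= 0:
        raise ValueError("the estimate assumes negative curvature")
    return 1.0 / math.sqrt(-curvature)


def path_for_growth(factor, curvature):
    """Path length on which a small deviation grows by `factor`."""
    return math.log(factor) * e_folding_length(curvature)


K = -1.0  # hypothetical constant curvature
ratio = path_for_growth(100.0, K) / e_folding_length(K)
print(ratio)  # ln 100, i.e. between 4 and 5 characteristic lengths
```

Under this assumption the ratio is ln 100 regardless of the value of K, which is the sense in which a hundredfold growth needs only a few characteristic path lengths.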

In this appendix, we will briefly set out the results of calculations related
to  geodesics  on  groups  with  one-sided  (right-  or  left-)  invariant  metrics. 
Proofs  and  further  details  can  be  found  in  the  following  places: 

V. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses
applications à l’hydrodynamique des fluides parfaits. Annales de l’Institut Fourier, XVI, no. 1
(1966), 319-361.

V. I. Arnold, An a priori estimate in the theory of hydrodynamic stability, Izv. Vyssh. Uchebn.
Zaved. Matematika 1966, no. 5 (54), 3-5. (Russian)

V. I. Arnold, The Hamiltonian nature of the Euler equations in the dynamics of a rigid body and
of an ideal fluid, Uspekhi Matematicheskikh Nauk, 24 (1969), no. 3 (147), 225-226.
(Russian)

L.  A.  Dikii,  A  remark  on  Hamiltonian  systems  connected  with  the  rotation  group,  Functional 
Analysis  and  Its  Applications,  6:4  (1972)  326-327. 

D. G. Ebin, J. Marsden, Groups of diffeomorphisms and the motion of an incompressible fluid,
Annals of Math. 92, no. 1 (1970), 102-163.

O.  A.  Ladyzhenskaya,  On  the  local  solvability  of  non-stationary  problems  for  incompressible 
ideal  and  viscous  fluids  and  vanishing  viscosity,  Boundary  problems  in  mathematical 
physics,  v.  5  (Zapiski  nauchnikh  seminarov  LOMI,  v.  21),  “Nauka,”  1971,  65-78.  (Russian) 

A. S. Mishchenko, Integrals of geodesic flows on Lie groups, Functional Analysis and Its
Applications, 4, no. 3 (1970), 232-235.

A.  M.  Obukhov,  On  integral  invariants  in  systems  of  hydrodynamic  type,  Doklady  Acad.  Nauk. 
184,  no.  2  (1969).  (Russian) 

L. D. Faddeev, Towards a stability theory of stationary planar-parallel flows of an ideal fluid,

A  Notation: The adjoint and co-adjoint representations

Let G be a real Lie group and $\mathfrak{g}$ its Lie algebra, i.e., the tangent space to the
group at the identity provided with the commutator bracket operation
[ , ].

A Lie group acts on itself by left and right translation: every element g
of the group G defines diffeomorphisms of the group onto itself:

$$L_g: G \to G, \quad L_g h = gh; \qquad R_g: G \to G, \quad R_g h = hg.$$

The induced maps of the tangent spaces will be denoted by

$$L_{g*}: TG_h \to TG_{gh} \quad\text{and}\quad R_{g*}: TG_h \to TG_{hg}$$

for every h in G.

The diffeomorphism $R_{g^{-1}} L_g$ is an inner automorphism of the group. It
leaves the group identity element fixed. Its derivative at the identity is a
linear map from the algebra (i.e., the tangent space to the group at the
identity) to itself. This map is denoted by

$$\mathrm{Ad}_g: \mathfrak{g} \to \mathfrak{g}, \qquad \mathrm{Ad}_g = (R_{g^{-1}} L_g)_{*e},$$

and is called the adjoint representation of the group. It is easy to verify that
$\mathrm{Ad}_g$ is an algebra homomorphism, i.e., that

$$\mathrm{Ad}_g[\xi, \eta] = [\mathrm{Ad}_g \xi, \mathrm{Ad}_g \eta], \qquad \xi, \eta \in \mathfrak{g}.$$

It is also clear that $\mathrm{Ad}_{gh} = \mathrm{Ad}_g \mathrm{Ad}_h$.

We can consider Ad as a map of the group into the space of linear operators
on the algebra:

$$\mathrm{Ad}(g) = \mathrm{Ad}_g.$$

The map Ad is differentiable. Its derivative at the identity of the group is a
linear map from the algebra $\mathfrak{g}$ to the space of linear operators on $\mathfrak{g}$. This
map is denoted by ad, and its image on an element $\xi$ in the algebra by $\mathrm{ad}_\xi$.
Thus $\mathrm{ad}_\xi$ is an endomorphism of the algebra space, and we have

$$\mathrm{ad} = \mathrm{Ad}_{*e}: \mathfrak{g} \to \operatorname{End} \mathfrak{g}, \qquad \mathrm{ad}_\xi = \left.\frac{d}{dt}\right|_{t=0} \mathrm{Ad}_{e^{t\xi}},$$

where $e^{t\xi}$ is the one-parameter group with tangent vector $\xi$. From the formula
written above it is easy to deduce an expression for ad in terms of the algebra
alone:

$$\mathrm{ad}_\xi \eta = [\xi, \eta].$$
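
The relation between Ad and ad can be checked numerically for a matrix group. The sketch below takes G = SO(3) as an example, represents algebra elements by skew-symmetric 3 x 3 matrices, uses Ad_g eta = g eta g^{-1} (valid for matrix groups), and compares a finite-difference derivative of Ad along exp(t xi) with the commutator. The helper names, the test vectors, and the step h are illustrative assumptions.

```python
import math


def hat(w):
    """Skew-symmetric matrix of w in R^3, an element of so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def combine(a, b, ca, cb):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]


def exp_so3(w, t):
    """exp(t * hat(w)) by Rodrigues' formula."""
    identity = [[float(i == j) for j in range(3)] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in w))
    if norm == 0.0:
        return identity
    k = hat([c / norm for c in w])
    theta = t * norm
    r = combine(identity, k, 1.0, math.sin(theta))
    return combine(r, matmul(k, k), 1.0, 1.0 - math.cos(theta))


def Ad(g, eta_hat):
    """Ad_g eta = g eta g^{-1}; for a rotation matrix g^{-1} = g^T."""
    return matmul(matmul(g, eta_hat), transpose(g))


xi, eta = [0.3, -1.2, 0.7], [1.1, 0.4, -0.5]
h = 1e-5

# Central difference for d/dt Ad_{exp(t xi)} eta at t = 0.
plus = Ad(exp_so3(xi, h), hat(eta))
minus = Ad(exp_so3(xi, -h), hat(eta))
derivative = combine(plus, minus, 0.5 / h, -0.5 / h)

# Commutator [xi, eta] = xi eta - eta xi.
bracket = combine(matmul(hat(xi), hat(eta)), matmul(hat(eta), hat(xi)), 1.0, -1.0)

max_diff = max(abs(derivative[i][j] - bracket[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The printed difference should be at the level of the finite-difference error, far below the size of the matrix entries.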

We now consider the dual vector space $\mathfrak{g}^*$ to the Lie algebra $\mathfrak{g}$. This is
the space of real linear functionals on the Lie algebra. In other words, $\mathfrak{g}^*$
is the cotangent space to the group at the identity, $\mathfrak{g}^* = T^*G_e$. The value
of an element $\xi$ of the cotangent space to the group at some point g on an
element $\eta$ of the tangent space at the same point will be denoted by round
brackets:

$$(\xi, \eta) \in \mathbb{R}, \qquad \xi \in T^*G_g, \quad \eta \in TG_g.$$

Left and right translation induce operators on the cotangent space dual
to $L_{g*}$ and $R_{g*}$. We denote them by

$$L_g^*: T^*G_{gh} \to T^*G_h \quad\text{and}\quad R_g^*: T^*G_{hg} \to T^*G_h$$

for every h in G. These operators are defined by the identities

$$(L_g^* \xi, \eta) = (\xi, L_{g*} \eta) \quad\text{and}\quad (R_g^* \xi, \eta) = (\xi, R_{g*} \eta).$$

The transpose operators $\mathrm{Ad}_g^*$, where g runs through the Lie group G, form
a representation of this group, i.e., they satisfy the relations

$$\mathrm{Ad}_{gh}^* = \mathrm{Ad}_h^* \mathrm{Ad}_g^*.$$

This  representation  is  called  the  co-adjoint  representation  of  the  group  and 
plays  an  important  role  in  all  questions  related  to  (left)  invariant  metrics  on 
the  group. 

Consider the derivative of the operator $\mathrm{Ad}_g^*$ with respect to g at the identity.
This derivative is a linear map from the algebra to the space of linear operators
on the dual space to the algebra. This linear map is denoted by $\mathrm{ad}^*$, and its
image on an element $\xi$ in the algebra is denoted by $\mathrm{ad}_\xi^*$. Thus $\mathrm{ad}_\xi^*$ is a linear
operator on the dual space to the algebra,

$$\mathrm{ad}_\xi^*: \mathfrak{g}^* \to \mathfrak{g}^*.$$

It is easy to see that $\mathrm{ad}_\xi^*$ is the adjoint of $\mathrm{ad}_\xi$:

$$(\mathrm{ad}_\xi^* \eta, \zeta) = (\eta, \mathrm{ad}_\xi \zeta) \quad\text{for all } \eta \in \mathfrak{g}^*,\ \zeta \in \mathfrak{g}.$$

It is sometimes convenient to denote the action of $\mathrm{ad}^*$ by braces:

$$\mathrm{ad}_\xi^* \eta = \{\xi, \eta\}, \quad\text{where } \xi \in \mathfrak{g},\ \eta \in \mathfrak{g}^*.$$

Thus braces mean the bilinear function from $\mathfrak{g} \times \mathfrak{g}^*$ to $\mathfrak{g}^*$, related to
commutation in the algebra by the identity

$$(\{\xi, \eta\}, \zeta) = (\eta, [\xi, \zeta]).$$

We  consider  now  the  orbits  of  the  co-adjoint  representation  of  the  group 
in  the  dual  space  of  the  algebra.  At  each  point  of  an  orbit  we  have  a  natural 
symplectic  structure  (called  the  Kirillov  form  since  A.  A.  Kirillov  first  used  it 
to  investigate  representations  of  nilpotent  Lie  groups).  Thus,  the  orbits  of 
the  co-adjoint  representation  are  always  even-dimensional.  We  also  note 
that  we  obtain  a  series  of  examples  of  symplectic  manifolds  by  looking  at 
different  Lie  groups  and  all  possible  orbits. 

The symplectic structure on the orbits of the co-adjoint representation is
defined by the following construction. Let x be a point in the dual space to
the algebra and $\xi$ a vector tangent at this point to its orbit. Since $\mathfrak{g}^*$ is a
vector space, we can consider the vector $\xi$, which really belongs to the tangent
space to $\mathfrak{g}^*$ at x, as lying in $\mathfrak{g}^*$.

The vector $\xi$ can be represented (in many ways) as the velocity vector of
the motion of the point x under the co-adjoint action of the one-parameter
group $e^{at}$ with velocity vector $a \in \mathfrak{g}$. In other words, every vector tangent to
the orbit of x in the co-adjoint representation of the group can be expressed
in terms of a suitable vector a in the algebra by the formula

$$\xi = \{a, x\}, \qquad a \in \mathfrak{g},\ x \in \mathfrak{g}^*.$$

Now we are ready to define the value of the symplectic 2-form $\Omega$ on a pair
of vectors $\xi_1$, $\xi_2$ tangent to the orbit of x. Namely, we express $\xi_1$ and $\xi_2$ in
terms of algebra elements $a_1$ and $a_2$ by the formula above, and then obtain
the scalar

$$\Omega(\xi_1, \xi_2) = (x, [a_1, a_2]), \qquad x \in \mathfrak{g}^*,\ a_i \in \mathfrak{g}.$$

It is easy to verify that (1) the bilinear form $\Omega$ is well defined, i.e., its value does
not depend on the choice of $a_i$; (2) $\Omega$ is skew-symmetric and therefore gives
a differential 2-form $\Omega$ on the orbit; and (3) $\Omega$ is nondegenerate and closed
(the proofs can be found, for instance, in Appendix 5). Thus the form $\Omega$ is a
symplectic structure on an orbit of the co-adjoint representation.
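
Properties (1) and (2) can be illustrated for the rotation group. The sketch assumes the standard identification of so(3) and its dual with R^3 by means of the dot product, under which the commutator is the vector product and the defining identity for braces gives {a, x} = x × a. The particular vectors and the function names are hypothetical. Since x × x = 0, replacing a_1 by a_1 + s x does not change the tangent vector, so the value of the form should not change either.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def brace(a, x):
    """Tangent vector {a, x} to the orbit of x; equals x × a on so(3)."""
    return cross(x, a)


def kirillov(x, a1, a2):
    """Value (x, [a1, a2]) of the 2-form on the vectors {a1, x}, {a2, x}."""
    return dot(x, cross(a1, a2))


x = (1.0, 2.0, 2.0)
a1 = (0.5, -1.0, 3.0)
a2 = (2.0, 0.0, -1.0)
a1_alt = tuple(p + 2.5 * q for p, q in zip(a1, x))  # same tangent vector

xi1, xi1_alt = brace(a1, x), brace(a1_alt, x)
omega = kirillov(x, a1, a2)
omega_alt = kirillov(x, a1_alt, a2)
omega_swapped = kirillov(x, a2, a1)
radial = dot(xi1, x)  # vanishes: the orbit lies on a sphere |x| = const
print(xi1, xi1_alt, omega, omega_alt, omega_swapped, radial)
```

In this example the orbits are the spheres centered at the origin (and the origin itself), which are indeed even-dimensional.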

B  Left-invariant metrics

A  riemannian  metric  on  a  Lie  group  G  is  called  left-invariant  if  it  is  preserved 
by  all  left  translations  Lg,  i.e.,  if  the  derivative  of  left  translation  carries  every 
vector  to  a  vector  of  the  same  length. 

It  is  sufficient  to  give  a  left-invariant  metric  at  one  point  of  the  group,  for 
instance  the  identity;  then  the  metric  can  be  carried  to  the  remaining  points 
by  left  translations.  Thus  there  are  as  many  left-invariant  riemannian  metrics 
on  a  group  as  there  are  euclidean  structures  on  the  algebra. 

A euclidean structure on the algebra is defined by a symmetric positive
definite operator from the algebra to its dual space. Thus, let $A: \mathfrak{g} \to \mathfrak{g}^*$ be
a symmetric positive linear operator:

$$(A\xi, \eta) = (A\eta, \xi) \quad\text{for all } \xi, \eta \text{ in } \mathfrak{g}.$$

(It is not very important that A be positive, but in mechanical applications
the quadratic form $(A\xi, \xi)$ is positive definite.)

We define a symmetric operator $A_g: TG_g \to T^*G_g$ by left translation:

$$A_g \xi = L_{g^{-1}}^* A L_{g^{-1}*} \xi.$$

We thus obtain the following commutative diagram of linear operators:

[Commutative diagram; recoverable arrow labels: $\mathrm{Ad}_g$, $\mathrm{Ad}_g^*$]

We will denote by angled brackets the scalar product determined by the
operator $A_g$:

$$\langle \xi, \eta \rangle_g = (A_g \xi, \eta) = (A_g \eta, \xi) = \langle \eta, \xi \rangle_g.$$

This scalar product gives a riemannian metric on the group G, invariant under
left translations. The scalar product in the algebra will be denoted simply by
$\langle\ ,\ \rangle$. We define an operation $B: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ by the identity

$$\langle [a, b], c \rangle = \langle B(c, a), b \rangle \quad\text{for all } b \text{ in } \mathfrak{g}.$$

Clearly, this operation B is bilinear, and for fixed first argument is
skew-symmetric in the second:

$$\langle B(c, a), b \rangle + \langle B(c, b), a \rangle = 0.$$
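
For a concrete feeling for the operation B, one can compute it for the rotation group. The sketch below assumes that so(3) and its dual are identified with R^3 by the dot product, that the commutator is the vector product, and that A is diagonal with hypothetical principal moments (1, 2, 3). Under these assumptions the defining identity is satisfied by B(c, a) = A^{-1}(Ac × a); this closed form is worked out here for illustration and is not taken from the text.

```python
INERTIA = (1.0, 2.0, 3.0)  # hypothetical eigenvalues of the operator A


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def apply_A(v):
    return tuple(i * c for i, c in zip(INERTIA, v))


def apply_A_inv(m):
    return tuple(c / i for i, c in zip(INERTIA, m))


def inner(a, b):
    """Scalar product <a, b> = (A a, b) on the algebra."""
    return dot(apply_A(a), b)


def B(c, a):
    """Candidate for the operation B: A^{-1}(A c × a)."""
    return apply_A_inv(cross(apply_A(c), a))


a = (0.4, -1.3, 0.9)
b = (1.5, 0.2, -0.7)
c = (-0.6, 0.8, 1.1)

defining_gap = inner(cross(a, b), c) - inner(B(c, a), b)
skew_sum = inner(B(c, a), b) + inner(B(c, b), a)
print(defining_gap, skew_sum)
```

Both printed numbers should vanish up to rounding, in agreement with the defining identity and with the skew-symmetry stated above.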

C  Example

Let  G  =  SO(3)  be  the  group  of  rotations  of  three-dimensional  euclidean 
space,  i.e.  the  configuration  space  of  a  rigid  body  fixed  at  a  point.  A  motion 
of the body is then described by a curve g = g(t) in the group. The Lie algebra $\mathfrak{g}$
of G is the three-dimensional space of angular velocities of all possible
rotations. The commutator in this algebra is the usual vector product.

A rotation velocity $\dot g$ of the body is a tangent vector to the group at the
point g. To get the angular velocity, we must carry this vector to the tangent
space of the group at the identity, i.e. to the algebra. But this can be done in
two ways: by left and right translation. As a result, we obtain two different
vectors in the algebra:

$$\omega_c = L_{g^{-1}*} \dot g \in \mathfrak{g} \quad\text{and}\quad \omega_s = R_{g^{-1}*} \dot g \in \mathfrak{g}.$$

These two vectors are none other than the “angular velocity in the body” and
the “angular velocity in space.”


An element g of the group G corresponds to a position of the body obtained by the motion g
from some initial state (corresponding to the identity element of the group and chosen
arbitrarily). Let $\omega$ be an element of the algebra.

Let $e^{\omega t}$ be a one-parameter group of rotations with angular velocity $\omega$; $\omega$ is the tangent
vector to this one-parameter group at the identity. Now we look at the displacement

$$e^{\omega \tau} g, \quad\text{where } g = g(t) \in G,\ \omega \in \mathfrak{g},\ \text{and } \tau \ll 1,$$

obtained from the displacement g by a rotation with angular velocity $\omega$ after a small time $\tau$.
If the vector $\dot g$ coincides with the vector

$$\left.\frac{d}{d\tau}\right|_{\tau=0} e^{\omega \tau} g,$$

then $\omega$ is called the angular velocity relative to space and is denoted by $\omega_s$. Thus $\omega_s$ is obtained
from $\dot g$ by right translation. In an analogous way we can show that the angular velocity in
the body is the left translate of the vector $\dot g$ in the algebra.
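
The two translates can be compared numerically for rotation matrices, where left and right translation of a tangent vector are simply matrix multiplication on the left and on the right. The sketch below assumes a motion g(t) = g0 exp(t w) with constant angular velocity w in the body; the initial position g0, the vector w, the time t, and the helper names are hypothetical. It recovers w as the left translate of the velocity and checks that the right translate is the same vector rotated by g.

```python
import math


def hat(w):
    """Skew-symmetric matrix of w in R^3, an element of so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def vee(m):
    """Inverse of hat: read the vector off a skew-symmetric matrix."""
    return [m[2][1], m[0][2], m[1][0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def exp_so3(w, t):
    """exp(t * hat(w)) by Rodrigues' formula (w must be nonzero)."""
    norm = math.sqrt(sum(comp * comp for comp in w))
    k = hat([comp / norm for comp in w])
    k2 = matmul(k, k)
    s, c = math.sin(t * norm), math.cos(t * norm)
    return [[float(i == j) + s * k[i][j] + (1.0 - c) * k2[i][j]
             for j in range(3)] for i in range(3)]


g0 = exp_so3([0.2, 0.5, -0.4], 1.0)  # arbitrary initial position
omega_body = [0.7, -0.3, 1.1]         # constant angular velocity in the body


def g(t):
    """Motion g(t) = g0 exp(t * omega_body)."""
    return matmul(g0, exp_so3(omega_body, t))


t, h = 0.8, 1e-5
g_now, g_plus, g_minus = g(t), g(t + h), g(t - h)
g_dot = [[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(3)]
         for i in range(3)]

omega_c = vee(matmul(transpose(g_now), g_dot))  # left translate g^{-1} g_dot
omega_s = vee(matmul(g_dot, transpose(g_now)))  # right translate g_dot g^{-1}
rotated = [sum(g_now[i][j] * omega_c[j] for j in range(3)) for i in range(3)]
print(omega_c, omega_s, rotated)
```

Up to the finite-difference error, omega_c reproduces omega_body, and omega_s coincides with g applied to omega_c, i.e. the angular velocity in space is obtained from the angular velocity in the body by the adjoint action of g.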


The  dual  space  g*  to  the  algebra  in  our  example  is  the  space  of  angular 
momenta. 

The kinetic energy of a body is determined by the vector of angular velocity
in the body and does not depend on the position of the body in space.
Therefore, kinetic energy gives a left-invariant riemannian metric on the group.
The symmetric positive definite operator $A_g: TG_g \to T^*G_g$ given by this
metric is called the moment of inertia operator (or tensor). It is related to the
kinetic energy by the formula

$$T = \tfrac12 \langle \dot g, \dot g \rangle_g = \tfrac12 \langle \omega_c, \omega_c \rangle = \tfrac12 (A\omega_c, \omega_c) = \tfrac12 (A_g \dot g, \dot g),$$

where $A: \mathfrak{g} \to \mathfrak{g}^*$ is the value of $A_g$ for g = e. The image of the
vector $\dot g$ under the action of the moment of inertia operator $A_g$ is called the
angular momentum and is denoted by $M = A_g \dot g$. The vector M lies in the
cotangent space to the group at the point g, and it can be carried to the
cotangent space to the group at the identity by both left and right translations.
We obtain two vectors

$$M_c = L_g^* M \in \mathfrak{g}^* \quad\text{and}\quad M_s = R_g^* M \in \mathfrak{g}^*.$$

These vectors in the dual space to the algebra are none other than the
angular momentum relative to the body ($M_c$) and the angular momentum
relative to space ($M_s$). This follows easily from the expression for kinetic
energy in terms of momentum and angular velocity:

$$T = \tfrac12 (M_c, \omega_c) = \tfrac12 (M, \dot g).$$

By  the  principle  of  least  action,  the  motion  of  a  rigid  body  under  inertia 
(with  no  external  forces)  is  a  geodesic  in  the  group  of  rotations  with  the  left- 
invariant  metric  described  above. 

We  will  now  look  at  a  geodesic  of  an  arbitrary  left-invariant  riemannian 
metric  on  an  arbitrary  Lie  group  as  a  motion  of  a  “generalized  rigid  body” 
with configuration space G. Such a “rigid body with group G” is determined
by its kinetic energy, i.e., a positive definite quadratic form on the Lie algebra.
More precisely, we will consider geodesics of a left-invariant metric on a
group G given by a quadratic form $\langle \omega, \omega \rangle$ on the algebra as motions of a
rigid body with group G and kinetic energy $\langle \omega, \omega \rangle / 2$.

To every motion $t \to g(t)$ of our generalized rigid body we can associate
four curves:

$$t \to \omega_c(t) \in \mathfrak{g}, \qquad t \to \omega_s(t) \in \mathfrak{g},$$
$$t \to M_c(t) \in \mathfrak{g}^*, \qquad t \to M_s(t) \in \mathfrak{g}^*,$$

called  motions  of  the  vectors  of  angular  velocity  and  momentum  in  the  body 
and  in  space.  The  differential  equations  which  these  curves  satisfy  were  found 
by  Euler  for  an  ordinary  rigid  body.  However,  they  are  true  in  the  most  general 
case  of  an  arbitrary  group  G,  and  we  will  call  them  the  Euler  equations  for  a 
generalized  rigid  body. 

Remark. In the ordinary theory of a rigid body six different three-dimensional
spaces $\mathbb{R}^3$, $\mathbb{R}^{3*}$, $\mathfrak{g}$, $\mathfrak{g}^*$, $TG_g$, and $T^*G_g$ are identified. The fact that the
dimensions of the space $\mathbb{R}^3$ in which the body moves and of the Lie algebra $\mathfrak{g}$
of its group of motions are the same is an accident related to the dimension 3;
in the n-dimensional case, $\mathfrak{g}$ has dimension $n(n - 1)/2$.

The identification of the Lie algebra $\mathfrak{g}$ with its dual space $\mathfrak{g}^*$ has a more
profound basis. The fact is that on the group of rotations there exists (and is
unique up to multiplication by a constant) a two-sided invariant riemannian metric. This
metric gives once and for all a preferred isomorphism of the vector spaces $\mathfrak{g}$
and $\mathfrak{g}^*$ (and also of $TG_g$ and $T^*G_g$). It allows us therefore to consider the
vectors of angular velocity and momentum as lying in the same euclidean
space. With this identification, the operation { , } is simply the commutator
of the algebra, taken with a minus sign.
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A  two-sided  invariant  metric  exists  on  any  compact  Lie  group.  Therefore, 
to  study  motions  of  rigid  bodies  with  compact  groups  we  may  identify  the 
spaces  of  angular  velocities  and  momenta.  However,  we  cannot  make  this 
identification  for  applications  to  non-compact  (or  infinite-dimensional) 
groups  of  dilTeomorphisms. 

D  Euler's  equation 

The results of Euler (obtained by him in the particular case G = SO(3)) can
be  formulated  as  the  following  theorems  on  the  motion  of  the  vectors  of 
angular  velocity  and  momentum  of  a  generalized  rigid  body  with  group  G. 

Theorem 1. The vector of angular momentum relative to space is preserved
under motion:

$$\frac{dM_s}{dt} = 0.$$


Theorem 2. The vector of angular momentum relative to the body satisfies
Euler's equation

$$\frac{dM_c}{dt} = \{\omega_c, M_c\}.$$


These  theorems  are  proved  for  a  generalized  rigid  body  in  the  same  way  as 
for  an  ordinary  rigid  body. 

Remark 1. The vector of angular velocity in the body, $\omega_c$, can be expressed
linearly in terms of the vector of angular momentum in the body, $M_c$, by
using the inverse of the inertia operator: $\omega_c = A^{-1} M_c$. Therefore, Euler’s
equation can be considered as an equation for the vector of angular momentum
in the body alone; its right-hand side is quadratic in $M_c$.
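
As an illustration, the following Python sketch integrates Euler's equation numerically for the ordinary rigid body, G = SO(3). It assumes the standard identification of the Lie algebra and its dual with $\mathbb{R}^3$, under which the right-hand side becomes the vector product $M_c \times \omega_c$, and a diagonal inertia operator whose principal moments are chosen arbitrarily. The function names (`euler_rhs`, `rk4_step`, `integrate`) and the Runge-Kutta scheme are choices made for this example only.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def cross(a: Vec, b: Vec) -> Vec:
    """Vector product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def euler_rhs(m: Vec, inertia: Vec) -> Vec:
    """Right-hand side M x omega, with omega = A^{-1} M and A diagonal."""
    omega = (m[0] / inertia[0], m[1] / inertia[1], m[2] / inertia[2])
    return cross(m, omega)


def axpy(a: Vec, b: Vec, s: float) -> Vec:
    """Return a + s * b."""
    return (a[0] + s * b[0], a[1] + s * b[1], a[2] + s * b[2])


def rk4_step(m: Vec, inertia: Vec, dt: float) -> Vec:
    """One classical Runge-Kutta step for dM/dt = M x A^{-1} M."""
    k1 = euler_rhs(m, inertia)
    k2 = euler_rhs(axpy(m, k1, dt / 2), inertia)
    k3 = euler_rhs(axpy(m, k2, dt / 2), inertia)
    k4 = euler_rhs(axpy(m, k3, dt), inertia)
    return (m[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            m[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
            m[2] + dt / 6 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]))


def integrate(m0: Vec, inertia: Vec, dt: float, steps: int) -> Vec:
    """Advance the angular momentum in the body by `steps` steps of size dt."""
    m = m0
    for _ in range(steps):
        m = rk4_step(m, inertia, dt)
    return m


def energy(m: Vec, inertia: Vec) -> float:
    """Kinetic energy (M, A^{-1} M) / 2."""
    return 0.5 * sum(m[i] ** 2 / inertia[i] for i in range(3))


def length_squared(m: Vec) -> float:
    """Squared euclidean length of the angular momentum."""
    return sum(x * x for x in m)
```

Only the momentum in the body enters the computation; the position of the body in space is never needed. Along the computed trajectory both `energy` and `length_squared` stay constant up to integration error, since the right-hand side is orthogonal to both $M_c$ and $\omega_c$.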

We can also express this result in the following way. Consider the phase
flow of our rigid body. (Its phase space $T^*G$ has dimension twice the dimension
n of the group G or the space of angular momenta $\mathfrak{g}^*$.) Then this phase
flow in a 2n-dimensional manifold factors over the flow given by Euler’s
equation in the n-dimensional vector space $\mathfrak{g}^*$.


A factorization of a phase flow $g^t$ on a manifold X over a phase flow $f^t$ on a manifold Y
is a smooth mapping $\pi$ of X onto Y under which motions $g^t$ are mapped to motions $f^t$, so that
the following diagram commutes (i.e., $\pi g^t = f^t \pi$):

$$\begin{array}{ccc}
X & \xrightarrow{\;g^t\;} & X \\
\downarrow \pi & & \downarrow \pi \\
Y & \xrightarrow{\;f^t\;} & Y
\end{array}$$

In our case, $X = T^*G$ is the phase space of the body, $Y = \mathfrak{g}^*$ is the space of angular momenta.
The projection $\pi: T^*G \to \mathfrak{g}^*$ is defined by left translation ($\pi M = L_g^* M$ for $M \in T^*G_g$), $g^t$ is
the phase flow of the body under consideration on the 2n-dimensional space $T^*G$, and $f^t$ is the
phase flow of the Euler equation in the n-dimensional space of angular momenta $\mathfrak{g}^*$.


In other words, a motion of the vector of angular momentum relative to
the body depends only on the initial position of the vector of angular momentum
relative to the body and does not depend on the position of the
body in the space.

Remark 2. The law of conservation of the vector of angular momentum
relative to space can be expressed by saying that every component of this
vector in some coordinate system on the space $\mathfrak{g}^*$ is conserved. We thus
obtain a set of first integrals of the equations of motion of the rigid body. In
particular, to every element of the Lie algebra $\mathfrak{g}$ there corresponds a linear
function on the space $\mathfrak{g}^*$ and, therefore, a first integral. The Poisson brackets
of first integrals given by functions on $\mathfrak{g}^*$ are themselves functions on $\mathfrak{g}^*$, as
can be seen easily. We thus obtain an (infinite-dimensional) extension of the
Lie algebra $\mathfrak{g}$, consisting of all functions on $\mathfrak{g}^*$. The algebra $\mathfrak{g}$ itself is included in this
extension as the Lie algebra of linear functions on $\mathfrak{g}^*$. Of course, of all these
first integrals of the phase flow in a 2n-dimensional space only n are functionally
independent. As the n independent integrals we can take, for example,
n linear functions on $\mathfrak{g}^*$ which form a basis in $\mathfrak{g}$.

Because  of  possible  infinite-dimensional  applications,  we  would  like  to 
avoid  coordinates  and  formulate  statements  about  first  integrals  intrinsically. 
This  can  be  done  by  reformulating  Theorem  1  in  the  following  way. 


Theorem 3. The orbits of the co-adjoint representation of a group in the dual
space to the algebra are invariant manifolds for the flow in this space given
by Euler's equation.

Proof. $M_c(t)$ is obtained from $M_s(t)$ by the action of the co-adjoint representation,
and $M_s(t)$ remains fixed. □

Example. In the case of an ordinary rigid body, the orbits of the co-adjoint
representation of the group in the space of momenta are the spheres
$M_1^2 + M_2^2 + M_3^2 = \text{const}$. In this case Theorem 3 is reduced to the law of
conservation of the length of the angular momentum. It consists of the fact
that, if the initial point $M_c$ lies on some orbit (i.e., in the given case on the
sphere $M^2 = \text{const}$), then all the points of its trajectory under the action of
Euler’s equation lie on the same orbit.

We now return to the general case of an arbitrary group G and recall that
each orbit of the co-adjoint representation has a symplectic structure (cf.
subsection A). Furthermore, the kinetic energy of the body can be expressed
in terms of the angular momentum relative to the body. As a result we obtain
a quadratic form on the space of angular momenta

$$T = \tfrac{1}{2}(M_c, A^{-1} M_c).$$

Let us fix some one orbit V of the co-adjoint representation. We consider the
kinetic energy as a function on this orbit:

$$H: V \to \mathbb{R}, \qquad H(M_c) = \tfrac{1}{2}(M_c, A^{-1} M_c).$$

Theorem 4. On every orbit V of the co-adjoint representation, Euler's equation
is hamiltonian with hamiltonian function H.


Proof. Every vector $\xi$ tangent to V at a point M has the form $\xi = \{f, M\}$, where $f \in \mathfrak{g}$. In
particular, the vector field on the right side of Euler's equation can be written in the form
$X = \{dT, M\}$ (here the differential of the function T at a point M of the vector space $\mathfrak{g}^*$ is
considered as a vector of the dual space to $\mathfrak{g}^*$, i.e., as an element of the Lie algebra $\mathfrak{g}$). It follows
from the definitions of the symplectic structure $\Omega$ and the operation { , } (cf. subsection A)
that for every vector $\xi$ tangent to V at M,

$$\Omega(\xi, X) = (M, [f, dT]) = (dT, \{f, M\}) = (dH, \xi). \qquad \square$$

Euler’s  equation  can  be  carried  over  from  the  dual  space  of  the  algebra  to 
the  algebra  itself  by  inversion  of  the  moment  of  inertia  operator.  As  a  result 
we  obtain  the  following  formulation  of  Euler’s  equation  in  terms  of  the 
operation  B  (section  B). 

Theorem 5. The motion of the vector of angular velocity in the body is determined
by the initial position of this vector and does not depend on the initial
position of the body. The vector of angular velocity in the body satisfies an
equation with quadratic right-hand side:

$$\dot{\omega}_c = B(\omega_c, \omega_c).$$
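
As a worked check, consider the ordinary rigid body. The computation below is a sketch under two assumptions that are not restated here: the Lie algebra of SO(3) is identified with $\mathbb{R}^3$ so that the commutator is the vector product, and the scalar product is $\langle a, b \rangle = (Aa, b)$ with a symmetric positive definite inertia operator A. It uses the relation $\langle [a, b], c \rangle = \langle B(c, a), b \rangle$ that defines the operation B.

```latex
\begin{aligned}
\langle [a, b], c \rangle &= (a \times b,\; Ac) = (Ac \times a,\; b)
   = \langle A^{-1}(Ac \times a),\; b \rangle, \\
B(c, a) &= A^{-1}(Ac \times a), \\
\dot{\omega}_c &= B(\omega_c, \omega_c) = A^{-1}(A\omega_c \times \omega_c), \\
I_1 \dot{\omega}_1 &= (I_2 - I_3)\,\omega_2 \omega_3 \quad \text{(and cyclic permutations, in principal axes).}
\end{aligned}
```

The last line is the classical form of Euler's equations, and applying A to the third line gives $\dot{M}_c = M_c \times \omega_c$, in agreement with the equation for momentum.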

We will call this equation Euler’s equation for angular velocity. We
notice that, under the action of the operator $A^{-1}: \mathfrak{g}^* \to \mathfrak{g}$, the orbits of the
co-adjoint representation are carried to invariant manifolds of Euler’s
equation for angular velocity; these manifolds have symplectic structure, etc.
However, unlike orbits in $\mathfrak{g}^*$, these invariant manifolds are not determined
by the Lie group G itself, but depend also on the choice of rigid body (i.e.,
moment of inertia operator).

From  the  law  of  conservation  of  energy  we  have 


Theorem 6. Euler's equations (for momentum and angular velocity) have a
quadratic first integral, whose value is equal to the kinetic energy

$$T = \tfrac{1}{2}(M_c, A^{-1} M_c) = \tfrac{1}{2}(A\omega_c, \omega_c).$$

E  Stationary  rotations  and  their  stability

A stationary rotation of a rigid body is a rotation for which the angular
velocity in the body is constant (and thus also the angular velocity in space;
it is easy to see that one implies the other). We know from the theory of an
ordinary rigid body in $\mathbb{R}^3$ that stationary rotations are rotations around the
major axes of the moment of inertia ellipsoid. Below, we formulate a generalization
of this theorem to the case of a rigid body with any Lie group. We note
that stationary rotations are geodesics of left-invariant metrics which are
one-parameter subgroups. We note also that the directions of the major axes of
the inertia ellipsoid can be determined by looking at the stationary points of
the kinetic energy on the sphere of vectors of momentum of fixed length.


Theorem 7. The angular momentum (respectively, angular velocity) of a
stationary rotation with respect to the body is a critical point of the energy
on the orbit of the co-adjoint representation (respectively, on the image of the
orbit under the action of the operator $A^{-1}$). Conversely, every critical point
of the energy on an orbit determines a stationary rotation.

The  proof  is  a  straightforward  computation  or  application  of  Theorem  4. 
We  note  that  the  partition  of  the  space  of  momenta  into  orbits  of  the  co- 
adjoint  representation  cannot  be  so  easily  constructed  in  the  case  of  an 
arbitrary  group  as  it  was  in  the  simple  case  of  an  ordinary  rigid  body;  in  that 
case  it  was  the  partition  of  three-dimensional  space  into  spheres  with  center 
0  and  the  point  0  itself.  In  the  general  case,  the  orbits  can  have  different 
dimensions,  and  the  partition  into  orbits  at  some  points  may  not  be  a 
fibering;  such  a  singularity  already  appeared  in  the  three-dimensional  case 
at  the  point  0. 

We  call  a  point  M  of  the  space  of  angular  momenta  a  regular  point  if  the 
partition  of  a  neighborhood  of  M  into  orbits  is  diffeomorphic  to  a  partition 
of  euclidean  space  into  parallel  planes  (in  particular,  all  orbits  near  the  point 
M  have  the  same  dimension).  For  example,  for  the  group  of  rotations  of 
three-dimensional  space  all  points  of  the  space  of  angular  momenta  are 
regular  except  the  origin. 

Theorem 8. Suppose that a regular point M of the space of angular momenta is
a critical point of the energy on an orbit of the co-adjoint representation,
and that the second differential of the energy $d^2H$ at this point is a (positive
or negative) definite form. Then M is a (Liapunov) stable equilibrium position
of Euler's equations.

Proof. It follows from the regularity of the orbits near this point that on
every neighboring orbit there exists near M a point which is a conditional
maximum or minimum of energy. □

Theorem 9. The second differential of the kinetic energy, restricted to the image
of an orbit of the co-adjoint representation in the algebra, is given at a
critical point $\omega \in \mathfrak{g}$ by the formula

$$2\, d^2H|_{\omega}(\xi) = \langle B(\omega, f), B(\omega, f) \rangle + \langle [f, \omega], B(\omega, f) \rangle,$$

where $\xi$ is a tangent vector to this image, expressed in terms of f by the
formula

$$\xi = B(\omega, f), \qquad f \in \mathfrak{g}.$$
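
For the ordinary rigid body the definiteness condition can be checked by hand, and the short Python sketch below does so. It assumes G = SO(3) with momenta identified with $\mathbb{R}^3$, a diagonal inertia operator with distinct principal moments, and the energy $H = \tfrac{1}{2}\sum M_i^2 / I_i$ restricted to a sphere $|M|^2 = \text{const}$. At a critical point on the i-th principal axis the Lagrange multiplier is $1/I_i$, so the restricted second differential has eigenvalues $1/I_j - 1/I_i$ for $j \ne i$. The function names are illustrative.

```python
from typing import Sequence, Tuple


def restricted_second_differential(inertia: Sequence[float], axis: int) -> Tuple[float, ...]:
    """Eigenvalues of d^2H on the tangent plane to the sphere at a point on `axis`.

    H = sum(M_i^2 / I_i) / 2 is restricted to |M|^2 = const; the Lagrange
    multiplier at the critical point on the given axis equals 1 / I_axis.
    """
    multiplier = 1.0 / inertia[axis]
    return tuple(1.0 / inertia[j] - multiplier for j in range(3) if j != axis)


def is_definite(eigenvalues: Sequence[float]) -> bool:
    """True if the quadratic form is positive definite or negative definite."""
    return all(e > 0 for e in eigenvalues) or all(e < 0 for e in eigenvalues)
```

With moments (1, 2, 3) the form is definite on the axes of smallest and largest moment and indefinite on the middle axis, so the sufficient condition for stability is met only for the first two.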


F  Riemannian  curvature  of  a  group  with 
left-invariant  metric 

Let  G  be  a  Lie  group  provided  with  the  left-invariant  metric  given  by  a 
scalar  product  <  ,  >  in  the  algebra.  We  note  that  the  riemannian  curvature 
of  the  group  G  at  any  point  is  determined  by  the  curvature  at  the  identity 
(since  left  translation  maps  the  group  to  itself  isometrically).  Therefore,  it  is 
sufficient  to  calculate  the  curvature  for  two-dimensional  planes  lying  in  the 
Lie  algebra. 

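Theorem 10. The curvature of a group in the direction determined by an
orthonormal pair of vectors $\xi$, $\eta$ in the algebra is given by the formula

$$K_{\xi,\eta} = \langle \delta, \delta \rangle + 2\langle \alpha, \beta \rangle - 3\langle \alpha, \alpha \rangle - 4\langle B_\xi, B_\eta \rangle,$$

where $2\delta = B(\xi, \eta) + B(\eta, \xi)$, $2\beta = B(\xi, \eta) - B(\eta, \xi)$, $2\alpha = [\xi, \eta]$, $2B_\xi = B(\xi, \xi)$,
$2B_\eta = B(\eta, \eta)$, and where B is the operation defined in section B.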

The proof is a tedious but straightforward calculation. It is based on the
easily verified formula for covariant derivative

$$(\nabla_\xi \eta)_e = \tfrac{1}{2}\bigl([\xi, \eta] - B(\xi, \eta) - B(\eta, \xi)\bigr),$$

where $\xi$ and $\eta$ on the left are left-invariant vector fields and on the right are
their values at the identity.

Remark 1. In the case of a two-sided invariant metric, the formula for
curvature has the particularly simple form

$$K_{\xi,\eta} = \tfrac{1}{4}\langle [\xi, \eta], [\xi, \eta] \rangle.$$
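
The curvature formula can be evaluated directly in a small example. The Python sketch below assumes the Lie algebra of SO(3) identified with $\mathbb{R}^3$ (commutator equal to the vector product) and a scalar product $\langle a, b \rangle = \sum I_i a_i b_i$ with hypothetical principal moments $I_i$. Under these assumptions the relation $\langle [a, b], c \rangle = \langle B(c, a), b \rangle$ gives $B(c, a) = A^{-1}(Ac \times a)$. All function names are chosen for this example.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def cross(a: Vec, b: Vec) -> Vec:
    """Vector product, used here as the commutator of so(3)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def inner(a: Vec, b: Vec, inertia: Vec) -> float:
    """Scalar product <a, b> = sum_i I_i a_i b_i."""
    return sum(inertia[i] * a[i] * b[i] for i in range(3))


def normalize(a: Vec, inertia: Vec) -> Vec:
    """Rescale a to unit length in the scalar product above."""
    n = math.sqrt(inner(a, a, inertia))
    return (a[0] / n, a[1] / n, a[2] / n)


def B(c: Vec, a: Vec, inertia: Vec) -> Vec:
    """Operation B(c, a) = A^{-1}(A c x a) for the diagonal operator A."""
    ac = (inertia[0] * c[0], inertia[1] * c[1], inertia[2] * c[2])
    w = cross(ac, a)
    return (w[0] / inertia[0], w[1] / inertia[1], w[2] / inertia[2])


def half_sum(a: Vec, b: Vec, sign: float) -> Vec:
    """Return (a + sign * b) / 2."""
    return tuple(0.5 * (a[i] + sign * b[i]) for i in range(3))


def curvature(xi: Vec, eta: Vec, inertia: Vec) -> float:
    """Curvature in the direction of an orthonormal pair xi, eta."""
    b_xe, b_ex = B(xi, eta, inertia), B(eta, xi, inertia)
    delta = half_sum(b_xe, b_ex, +1.0)
    beta = half_sum(b_xe, b_ex, -1.0)
    alpha = tuple(0.5 * x for x in cross(xi, eta))
    b_xi = tuple(0.5 * x for x in B(xi, xi, inertia))
    b_eta = tuple(0.5 * x for x in B(eta, eta, inertia))
    return (inner(delta, delta, inertia) + 2 * inner(alpha, beta, inertia)
            - 3 * inner(alpha, alpha, inertia) - 4 * inner(b_xi, b_eta, inertia))
```

For equal moments (1, 1, 1) the metric is two-sided invariant and the function returns 1/4 for the first two basis vectors, matching the simplified formula. For unequal moments the curvature may be negative in some directions: with moments (1, 2, 3) and the plane of the first two axes the result is -1/3.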

Remark 2. The formula for the curvature of a group with a right-invariant
riemannian metric coincides with the formula for the left-invariant case. In
fact, a right-invariant metric on a group is a left-invariant metric on the
group with the reverse multiplication law ($g_1 * g_2 = g_2 g_1$). Passage to the
reverse group changes the signs of both the commutator and the operation B
in the algebra. But, in every term of the formula for curvature, there is a
product of two operations changing the sign. Therefore, the formula for
curvature is the same in the right-invariant case.

In Euler’s equation the right-hand side changes sign under passage to the
right-invariant case.

G  Application  to  groups  of  diffeomorphisms

Let D be a bounded region in a riemannian manifold. Consider the group of
diffeomorphisms of D which preserve the volume element. We will denote
this group by SDiff D.

The Lie algebra corresponding to the group SDiff D consists of all vector
fields with divergence 0 on D, tangent to the boundary (if it is not empty). We
define the scalar product of two elements of this Lie algebra (i.e., two vector
fields) as

$$\langle v_1, v_2 \rangle = \int_D (v_1 \cdot v_2)\, dx,$$

where (·) is the scalar product giving the riemannian metric on D, and dx
is the riemannian volume element.

We now consider the flow of a uniform ideal (incompressible, non-viscous)
fluid on the region D. Such a flow is described by a curve $t \to g_t$ in
the group SDiff D. Namely, the diffeomorphism $g_t$ is the map which carries
every particle of the fluid from the place it was at time 0 to the place it is at
time t. It turns out that the kinetic energy of the moving fluid is a right-invariant
riemannian metric on the group of diffeomorphisms SDiff D.

Indeed, suppose that after time t the flow of the fluid gives a diffeomorphism $g_t$, and that
the velocity at this moment of time is given by the vector field v. Then the diffeomorphism
realized by the flow after time $t + \tau$ (where $\tau$ is small) will be $e^{v\tau} g_t$, up to a quantity small in
comparison with $\tau$ (here $e^{v\tau}$ is the one-parameter group with vector v, i.e., the phase flow of the
differential equation given by the field v). Therefore, the field of velocities v is obtained from the
vector $\dot{g}$ tangent to the group at the point g by right translation. This also implies the
right-invariance of the kinetic energy, which is by definition equal to

$$T = \tfrac{1}{2}\langle v, v \rangle$$

(we assume the density of the fluid to be 1).

The  principle  of  least  action  (which  in  mathematical  terms  is  the  definition 
of  an  ideal  fluid)  asserts  that  flows  of  an  ideal  fluid  are  geodesics  in  the  right- 
invariant metric just described on the group of diffeomorphisms.

Strictly speaking, an infinite-dimensional group of diffeomorphisms is not a manifold.
Therefore  the  exact  formulation  of  the  definition  above  requires  additional  work:  we  must 
choose  suitable  functional  spaces,  prove  a  theorem  on  existence  and  uniqueness  of  solutions, 
etc.  Up  to  now  this  has  been  done  only  in  the  case  when  the  dimension  of  the  region  of  the  flow  D 
is  equal  to  2.  However,  we  will  proceed  as  if  these  difficulties  connected  with  infinite  dimensions 
did  not  exist.  Thus  the  following  arguments  are  heuristic  in  character.  It  turns  out  that  many 
of  the  results  can  be  proved  rigorously,  independently  of  the  theory  of  infinite-dimensional 
manifolds. 

We will now indicate the form that the general formulas introduced above
take in the case G = SDiff D, where D is a connected region with finite
volume in a three-dimensional riemannian manifold. To do this we must
first describe explicitly the bilinear operation $B: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ defined in
section B by the formula

$$\langle [a, b], c \rangle \equiv \langle B(c, a), b \rangle.$$

It is easy to verify that in the three-dimensional case the vector field
B(c, a) can be expressed in terms of the vector fields a and c of our Lie algebra
by the formula

$$B(c, a) = (\operatorname{curl} c) \wedge a + \operatorname{grad} \alpha,$$

where $\wedge$ denotes the vector product, and $\alpha$ the single-valued function on D
which is uniquely (up to a constant summand) determined by the condition
$B \in \mathfrak{g}$ (i.e., the conditions div B = 0 and B is tangent to the boundary of D).

We note that the operation B does not depend on the choice of orientation,
since the vector product and curl both change sign with a change of orientation.

Stationary flows. Euler’s equation for “angular velocity” in the case
G = SDiff D has the form $\dot{v} = -B(v, v)$, since the metric is right-invariant.
Therefore, in the case of the group of diffeomorphisms of three-dimensional
space, it takes the form of “the equations of motion in Bernoulli’s form”

$$\frac{\partial v}{\partial t} = v \wedge \operatorname{curl} v + \operatorname{grad} \alpha, \qquad \operatorname{div} v = 0.$$

Euler’s equation for momentum is written in the form of the “vorticity
equation”

$$\frac{\partial \operatorname{curl} v}{\partial t} = [v, \operatorname{curl} v].$$


In  particular,  the  vorticity  of  a  stationary  flow  commutes  with  the  field  of 
velocities. 

This  remark  leads  quickly  to  a  topological  classification  of  stationary 
flows  of  an  ideal  fluid  in  three-dimensional  space. 


Theorem 11. Assume that the region D is bounded by a compact analytic surface,
and that the field of velocities is analytic and not everywhere collinear with
its curl. Then the region of the flow can be partitioned by an analytic submanifold
into a finite number of cells, in each of which the flow is constructed
in a standard way. Namely, the cells are of two types: those fibered into tori
invariant under the flow and those fibered into surfaces invariant under the
flow, diffeomorphic to the annulus $\mathbb{R} \times S^1$. On each of these tori the flow
lines are either all closed or all dense, and on each annulus all the flow lines
are closed.


To prove this theorem we look at the “Bernoulli surfaces,” i.e., the level
surfaces of the function $\alpha$. It follows from the condition for a flow to be
stationary ($v \wedge \operatorname{curl} v = -\operatorname{grad} \alpha$) that both the flow lines and the vortex
lines lie on the Bernoulli surface. Since the fields of velocity and vorticity
commute, the group $\mathbb{R}^2$ acts on the closed Bernoulli surface, and it must be a
torus (cf. the proof of Liouville’s theorem in Section 49). An analogous
calculation for the boundary conditions on the boundary of D shows that the
non-closed Bernoulli surfaces consist of annuli with closed flow lines.

Remark. The analyticity of the field of velocities is not very essential, but
it is important that the fields of velocity and vorticity not be collinear.
Computer experiments conducted by M. Henon show more complicated
behavior than described in the theorem for the flow lines of a stationary flow
on the three-dimensional torus; this field is given by the formulas

$$v_x = A \sin z + C \cos y, \qquad v_y = B \sin x + A \cos z, \qquad v_z = C \sin y + B \cos x.$$

The formulas are selected so that the vectors v and curl v are collinear. The
results of Henon’s calculations suggest that some flow lines densely fill up a
three-dimensional region.
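
The collinearity of v and curl v for this field can be confirmed numerically. The following Python sketch is only a finite-difference check under the assumption that the field is the one written above; the coefficient values, the sample point and the helper names (`abc_field`, `partial`, `curl`, `divergence`) are arbitrary choices for the example.

```python
import math
from typing import Callable, Tuple

Vec = Tuple[float, float, float]
Field = Callable[[float, float, float], Vec]


def abc_field(x: float, y: float, z: float,
              a: float = 1.0, b: float = 1.0, c: float = 1.0) -> Vec:
    """Velocity field with components (A sin z + C cos y, B sin x + A cos z, C sin y + B cos x)."""
    return (a * math.sin(z) + c * math.cos(y),
            b * math.sin(x) + a * math.cos(z),
            c * math.sin(y) + b * math.cos(x))


def partial(field: Field, p: Vec, comp: int, var: int, h: float = 1e-5) -> float:
    """Central-difference derivative of component `comp` with respect to coordinate `var`."""
    plus, minus = list(p), list(p)
    plus[var] += h
    minus[var] -= h
    return (field(*plus)[comp] - field(*minus)[comp]) / (2 * h)


def curl(field: Field, p: Vec, h: float = 1e-5) -> Vec:
    """Numerical curl of the field at the point p."""
    return (partial(field, p, 2, 1, h) - partial(field, p, 1, 2, h),
            partial(field, p, 0, 2, h) - partial(field, p, 2, 0, h),
            partial(field, p, 1, 0, h) - partial(field, p, 0, 1, h))


def divergence(field: Field, p: Vec, h: float = 1e-5) -> float:
    """Numerical divergence of the field at the point p."""
    return sum(partial(field, p, i, i, h) for i in range(3))
```

For this field the curl is not merely collinear with v but equal to it, and the divergence vanishes, so the hypothesis of Theorem 11 that v is not everywhere collinear with its curl fails.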

I  Isovorticial fields 

Two-dimensional hydrodynamics differs sharply from three-dimensional
hydrodynamics. The essence of this difference is contained in the difference
in the geometries of the orbits of the co-adjoint representation in the two-
and three-dimensional cases. In the two-dimensional case the orbits are in
some sense closed and behave, for example, like a family of level sets of a
function (more precisely of several functions: actually even an infinite number
of functions). In the three-dimensional case the orbits are more complicated;
in particular, they are unbounded (and perhaps dense). The orbits of the
co-adjoint representation of the group of diffeomorphisms of a three-dimensional
riemannian manifold can be described in the following way. Let $v_1$ and $v_2$ be
two vector fields of velocities of an incompressible fluid in the region D.
We say that the fields $v_1$ and $v_2$ are isovorticial if there is a volume-preserving
diffeomorphism $g: D \to D$ which carries every closed contour $\gamma$ in D to a new
contour such that the circulation of the first field along the original contour
is equal to the circulation of the second field along the new contour:

$$\oint_{\gamma} v_1 \, dl = \oint_{g\gamma} v_2 \, dl.$$

It is easy to verify that the image of an orbit of the co-adjoint representation
in the algebra (under the action of the inverse of the inertia operator, $A^{-1}$) is
none other than the set of fields isovorticial to the given field.

In particular, Theorem 3 now takes the form of the following law of conservation
of circulation:

Theorem 12. The circulation of a field of velocities of an ideal fluid over a closed
fluid contour does not change when the contour is carried by the flow to a
new position.

We note that if two fields of velocities of a three-dimensional ideal fluid
on D are isovorticial, then the corresponding diffeomorphism carries the curl
of the first field into the curl of the second:

$$g_* \operatorname{curl} v_1 = \operatorname{curl} v_2.$$

Furthermore, the isovorticity of two fields can be defined as the equivalence
of the fields of vorticity, if the region of the flow is simply-connected. Therefore,
the problem of the orbits of the co-adjoint representation in the three-dimensional
case includes the problem of classifying vector fields with
divergence zero up to volume-preserving diffeomorphisms. This last problem
in three dimensions is hopelessly difficult.

We now consider the two-dimensional case. First, we translate the basic
formulas into notation convenient for considering the two-dimensional case.
We assume that the region D of the flow is two-dimensional and oriented.
The metric and orientation give a symplectic structure on D; the vector field
of velocities has divergence zero and is therefore hamiltonian. Therefore, this
field is given by a hamiltonian function (many-valued, in general, if the region
D is not simply-connected). The hamiltonian function of a field of velocities
is called the stream function in hydrodynamics, and is denoted by $\psi$. Thus

$$v = I \operatorname{grad} \psi,$$

where I is the operator of clockwise rotation by 90°.

The stream function of the commutator of two fields turns out to be the
jacobian (or the Poisson bracket of hamiltonian formalism) of the stream
functions of the original fields

$$\psi_{[v_1, v_2]} = J(\psi_1, \psi_2).$$

The vector field B(c, a) is given, in the two-dimensional case, by the formula

$$B = -(\Delta\psi_c) \operatorname{grad} \psi_a + \operatorname{grad} \alpha,$$

where $\psi_a$ and $\psi_c$ are the stream functions of the fields a and c, and
$\Delta = \operatorname{div} \operatorname{grad}$ is the laplacian.

In the particular case of the euclidean plane with cartesian coordinates x
and y, the formulas for stream function, commutator and laplacian take the
particularly simple form

$$v_x = \frac{\partial\psi}{\partial y}, \qquad v_y = -\frac{\partial\psi}{\partial x},$$

$$J(\psi_1, \psi_2) = \frac{\partial\psi_1}{\partial x}\frac{\partial\psi_2}{\partial y} - \frac{\partial\psi_1}{\partial y}\frac{\partial\psi_2}{\partial x},$$

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}.$$

The vorticity (or curl) of a two-dimensional field of velocities is the scalar
function r such that the integral over any oriented region $\sigma$ in D of the
product of r with the oriented area element is equal to the circulation of the
field of velocities around the boundary of $\sigma$:

$$\iint_{\sigma} r \, dS = \oint_{\partial\sigma} v \, dl.$$

It is easy to compute an expression for the vorticity in terms of the stream
function:

$$r = -\Delta\psi.$$
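
These relations are easy to test numerically. The sketch below, in Python, assumes the euclidean plane, the convention that the velocity is the gradient of the stream function rotated clockwise by 90° (so its components are $(\partial\psi/\partial y, -\partial\psi/\partial x)$), and a hypothetical sample stream function $\sin x \cos 2y$ chosen only for illustration. Derivatives are approximated by central differences.

```python
import math
from typing import Callable, Tuple

Stream = Callable[[float, float], float]


def velocity(psi: Stream, x: float, y: float, h: float = 1e-4) -> Tuple[float, float]:
    """Velocity (d psi / dy, -d psi / dx) computed by central differences."""
    dpsi_dx = (psi(x + h, y) - psi(x - h, y)) / (2 * h)
    dpsi_dy = (psi(x, y + h) - psi(x, y - h)) / (2 * h)
    return (dpsi_dy, -dpsi_dx)


def divergence(psi: Stream, x: float, y: float, h: float = 1e-4) -> float:
    """Divergence of the velocity field derived from psi."""
    dvx_dx = (velocity(psi, x + h, y, h)[0] - velocity(psi, x - h, y, h)[0]) / (2 * h)
    dvy_dy = (velocity(psi, x, y + h, h)[1] - velocity(psi, x, y - h, h)[1]) / (2 * h)
    return dvx_dx + dvy_dy


def vorticity(psi: Stream, x: float, y: float, h: float = 1e-4) -> float:
    """Scalar curl d v_y / dx - d v_x / dy of the velocity field derived from psi."""
    dvy_dx = (velocity(psi, x + h, y, h)[1] - velocity(psi, x - h, y, h)[1]) / (2 * h)
    dvx_dy = (velocity(psi, x, y + h, h)[0] - velocity(psi, x, y - h, h)[0]) / (2 * h)
    return dvy_dx - dvx_dy


def sample_psi(x: float, y: float) -> float:
    """Sample stream function; its laplacian equals -5 * sample_psi."""
    return math.sin(x) * math.cos(2 * y)
```

For the sample function the laplacian is $-5\psi$, so `vorticity` should return approximately $5\psi$ at every point, while `divergence` should vanish up to rounding error.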

In the two-dimensional simply-connected case, isovorticity of fields $v_1$
and $v_2$ means simply that the functions $r_1$ and $r_2$ (the vorticities of these
fields) are carried to one another under a suitable volume-preserving
diffeomorphism.

Under such conditions the two functions $r_1$ and $r_2$ have the same distribution
function, i.e.,

$$\operatorname{mes}\{x \in D : r_1(x) \le c\} = \operatorname{mes}\{x \in D : r_2(x) \le c\},$$

for any number c. Therefore, if two fields are in the image of the same orbit
of the co-adjoint representation, then a whole series of functionals are equal,
for example, the integrals of all powers of the vorticity

$$\int_D r_1^k \, dS = \int_D r_2^k \, dS.$$
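
A discrete analogue makes the statement concrete. In the Python sketch below the region is assumed to be divided into cells of equal area on which the vorticity is constant, and a permutation of the cells plays the role of an area-preserving diffeomorphism. This discretization, and the names `power_integral`, `distribution` and `rearrange`, are assumptions of the example rather than constructions from the text.

```python
import random
from typing import List


def power_integral(vorticity: List[float], cell_area: float, k: int) -> float:
    """Integral of the k-th power of a piecewise-constant vorticity."""
    return cell_area * sum(r ** k for r in vorticity)


def distribution(vorticity: List[float], cell_area: float, c: float) -> float:
    """Measure of the set of cells on which the vorticity does not exceed c."""
    return cell_area * sum(1 for r in vorticity if r <= c)


def rearrange(vorticity: List[float], seed: int = 0) -> List[float]:
    """Permute the cells; a discrete stand-in for an area-preserving map."""
    cells = list(vorticity)
    random.Random(seed).shuffle(cells)
    return cells
```

Any rearrangement produced by `rearrange` has the same distribution function as the original list, and hence the same power integrals for every k, up to floating-point summation order.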

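In particular, Euler’s equations of motion of a two-dimensional ideal fluid

$$\frac{\partial v}{\partial t} + v \nabla v = -\operatorname{grad} p, \qquad \operatorname{div} v = 0,$$

have an infinite collection of first integrals. For example, the integral of any
power of the vorticity of the field of velocities

$$I_k = \iint_D \left( \frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y} \right)^k dx \wedge dy$$

is such a first integral.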
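
The conservation of these integrals can also be seen by a direct computation. The sketch below assumes the standard fact, not derived in the text, that in two dimensions the vorticity r is transported by the flow, $\partial r / \partial t + (v \cdot \nabla) r = 0$, together with div v = 0 and the tangency of v to the boundary of D (n denotes the outward normal).

```latex
\frac{d}{dt} \iint_D r^k \, dS
  = \iint_D k\, r^{k-1} \frac{\partial r}{\partial t} \, dS
  = -\iint_D v \cdot \nabla (r^k) \, dS
  = -\iint_D \operatorname{div}(r^k v) \, dS
  = -\oint_{\partial D} r^k \, (v \cdot n) \, dl
  = 0 .
```

The third equality uses div v = 0 and the last uses the boundary condition. The same argument applies to the integral of any smooth function of r.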

The existence of these first integrals (i.e., the relatively simple structure of
orbits of the co-adjoint representation) allows us to prove theorems on
existence and uniqueness, etc. in the two-dimensional hydrodynamics of an
ideal (and also of a viscous) fluid; the complicated geometry of orbits of the
co-adjoint representation in the three-dimensional case (or, perhaps, insufficient
information about these orbits) makes the foundations of three-dimensional
hydrodynamics a very hard problem.

J  Stability  of  planar  stationary  flows

Here  we  formulate  general  theorems  about  stationary  rotations  (Theorems 
7,  8,  and  9  above)  for  the  case  of  a  group  of  diffeomorphisms.  We  obtain  in 
this  way  the  following  assertions : 

1. A stationary flow of an ideal fluid is distinguished from all flows isovorticial
to it by the fact that it is a conditional extremum (or critical point)
of the kinetic energy.

2. If (i) the indicated critical point is actually an extremum, i.e., a local conditional
maximum or minimum, (ii) it satisfies certain (generally satisfied)
regularity conditions, and (iii) the extremum is non-degenerate (the
second differential is positive- or negative-definite), then the stationary
flow is stable (i.e., is a Liapunov stable equilibrium position of Euler’s
equation).

3. The formula for the second differential of the kinetic energy, on the tangent
space to the manifold of fields which are isovorticial to a given one, has the
following form in the two-dimensional case. Let D be a region in the
euclidean plane with cartesian coordinates x and y. Consider a stationary
flow with stream function $\psi = \psi(x, y)$. Then
$2\, d^2H = \iint_D (\delta v)^2 + (\nabla\psi / \nabla\Delta\psi)(\delta r)^2 \, dx\, dy$,
where $\delta v$ is the variation of the field of velocities
(i.e., a vector of the tangent space indicated above), and $\delta r = \operatorname{curl} \delta v$.

We note that for a stationary flow, the gradient vectors of the stream
function and its laplacian are collinear. Therefore the ratio $\nabla\psi / \nabla\Delta\psi$ makes
sense. Furthermore, in a neighborhood of every point where the gradient of
the vorticity is not zero, the stream function is a function of the vorticity
function.

The assertions introduced above lead to the conclusion that the positive
or negative definiteness of the quadratic form $d^2H$ is a sufficient condition
for stability of the stationary flow under consideration. This conclusion does
not formally follow from Theorems 7, 8, and 9 since the application of any of
our formulas in the infinite-dimensional case requires justification. Fortunately,
we can justify the final conclusion about stability without justifying
the intermediate constructions. Thus we can rigorously prove the following
a priori bounds (expressing the stability of a stationary flow in terms of small
perturbations of the initial velocity field).

Theorem 13. Suppose that the stream function of a stationary flow, $\psi = \psi(x, y)$,
in a region D is a function of the vorticity function (i.e., of the function $\Delta\psi$) not
only locally, but globally. Suppose that the derivative of the stream function
with respect to the vorticity satisfies the inequality

$$c \le \frac{\nabla\psi}{\nabla\Delta\psi} \le C, \qquad \text{where } 0 < c \le C < \infty.$$

Let $\psi + \varphi(x, y, t)$ be the stream function of another flow, not necessarily
stationary. Assume that, at the initial moment, the circulation of the velocity
field of the perturbed flow (with stream function $\psi + \varphi$) around every boundary
component of the region D is equal to the circulation of the original flow (with
stream function $\psi$). Then the perturbation $\varphi = \varphi(x, y, t)$ at every moment
of time is bounded in terms of the initial perturbation $\varphi_0 = \varphi(x, y, 0)$ by the
formula

$$\iint_D (\nabla\varphi)^2 + c(\Delta\varphi)^2 \, dx\, dy \le \iint_D (\nabla\varphi_0)^2 + C(\Delta\varphi_0)^2 \, dx\, dy.$$

If  the  stationary  flow  satisfies  the  inequality 

Vi b 

c  <  -  Trtr  <  C,  0  <  c  <  C  <  co, 

VAi p 

then  the  perturbation  <p  is  bounded  in  terms  of  (p0  by  the  formula 


-  (V(p)2  dx  dy  < 


(V<p0)2  dx  dy. 


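This theorem implies the stability of a stationary flow in the case of a
positive-definite quadratic form

$$\iint_D (\nabla\varphi)^2 + \frac{\nabla\psi}{\nabla\Delta\psi}(\Delta\varphi)^2 \,dx\,dy$$

with respect to $\nabla\varphi$ (where $\varphi$ is a constant function on every
component of the boundary of $D$ whose gradient flow is zero over every
boundary component), and also in the case of a negative definite form

$$\iint_D (\nabla\varphi)^2 + \left(\max \frac{\nabla\psi}{\nabla\Delta\psi}\right)(\Delta\varphi)^2 \,dx\,dy.$$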


Example 1. Consider a planar parallel flow in the strip $Y_1 \le y \le Y_2$ in the
$(x, y)$-plane with velocity profile $v(y)$ (i.e., with velocity field $(v(y), 0)$). Such
a  flow  is  stationary  for  any  velocity  profile.  To  make  the  region  of  the  flow 
compact,  we  impose  the  condition  that  the  velocity  fields  of  all  flows  under 
consideration  be  periodic  with  period  X  in  the  x-coordinate. 

The conditions of Theorem 13 are fulfilled if the velocity profile has no
points of inflection (i.e., if $d^2v/dy^2 \ne 0$). We come to the conclusion that
planar  parallel  flows  of  an  ideal  fluid  with  no  inflection  points  in  the  velocity 
profile  are  stable. 

The analogous proposition in the linearized problem is called Rayleigh's
theorem.

We emphasize that in Theorem 13 it is not a question of stability “in a linear
approximation,” but of actual strict Liapunov stability (i.e., with respect to
finite perturbations in the nonlinear problem). The difference between these
two forms of stability is substantial in this case, since our problem has a
hamiltonian character (cf. Theorem 4); for hamiltonian systems asymptotic
stability is impossible, so stability in a linear approximation is always
neutral and insufficient for a conclusion about the stability of an
equilibrium position of the nonlinear problem.


Example 2. Consider the planar parallel flow on the torus

$$\{(x, y),\ x \bmod X,\ y \bmod 2\pi\}$$

with velocity field $v = (\sin y, 0)$, parallel to the $x$-axis. This field is
determined by the stream function $\psi = -\cos y$ and has vorticity
$r = -\cos y$. The velocity profile has two inflection points, but the stream
function can be expressed as a function of the vorticity. The ratio
$\nabla\psi/\nabla\Delta\psi$ is equal to minus one. By applying Theorem 13 we
can convince ourselves of the stability of our stationary flow in the case when

$$\iint (\Delta\varphi)^2 \,dx\,dy \ge \iint (\nabla\varphi)^2 \,dx\,dy$$

for all functions $\varphi$ of period $X$ in $x$ and $2\pi$ in $y$. It is easy to
calculate that the last inequality is satisfied for $X \le 2\pi$ and violated
for $X > 2\pi$.
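
The calculation referred to here can be sketched numerically. Expanding $\varphi$ in Fourier modes $\exp(i(2\pi m x/X + n y))$, a single mode with wave vector $k = (2\pi m/X, n)$ contributes a multiple of $|k|^4$ to the left-hand integral and the same multiple of $|k|^2$ to the right-hand one, so the inequality holds for all $\varphi$ exactly when $|k|^2 \ge 1$ for every nonzero mode. The function name `inequality_holds` and the truncation `max_index` are assumptions made for this illustration only.

```python
import math

def inequality_holds(X, max_index=50):
    """Check |k|^2 >= 1 for all nonzero wave vectors k = (2*pi*m/X, n).

    For a single Fourier mode the two integrals are proportional to
    |k|^4 and |k|^2, so the integral inequality reduces to |k|^2 >= 1.
    """
    for m in range(-max_index, max_index + 1):
        for n in range(-max_index, max_index + 1):
            if m == 0 and n == 0:
                continue  # the constant mode contributes to neither integral
            k_squared = (2 * math.pi * m / X) ** 2 + n ** 2
            if k_squared < 1 - 1e-12:
                return False
    return True

print(inequality_holds(0.9 * 2 * math.pi))  # short torus, X < 2*pi
print(inequality_holds(1.1 * 2 * math.pi))  # long torus, X > 2*pi
```

The only mode that can violate the inequality is $m = \pm 1$, $n = 0$, for which $|k| = 2\pi/X$; this is why the threshold is $X = 2\pi$.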

Thus Theorem 13 implies the stability of a sinusoidal stationary flow on a
short torus, when the period in the direction of the basic flow ($X$) is less
than the width of the flow ($2\pi$). On the other hand, we can directly verify
that on a long torus (for $X > 2\pi$) our sinusoidal flow is unstable.97 Thus,
in this example, the sufficient condition for stability from Theorem 13 turns
out to be necessary.

We should note that in general an indefinite quadratic form $d^2H$ does not
imply instability of the corresponding flow. In general, an equilibrium
position of a hamiltonian system can be stable even though the hamiltonian
function at this position is neither a maximum nor a minimum. The quadratic
hamiltonian $H = p_1^2 + q_1^2 - p_2^2 - q_2^2$ is the simplest example of this kind.


K  Riemannian  curvature  of  a  group  of  diffeomorphisms

The expression for the curvature of a Lie group provided with a
one-sided-invariant metric, introduced in subsection E, makes sense also for
the group $\mathrm{SDiff}\,D$ of diffeomorphisms of a riemannian domain $D$.
This group is the configuration space for an ideal fluid filling the domain
$D$. The kinetic energy defines a right-invariant metric on
$\mathrm{SDiff}\,D$. The number which we obtain by formally applying the
formula for the curvature of a Lie group to this infinite-dimensional group is
naturally called the curvature of the group $\mathrm{SDiff}\,D$.

97 Cf., for example, the article of L. D. Meshalkin and Y. G. Sinai,
“Investigation of the stability of a stationary solution of a system of
equations for the plane movement of an incompressible viscous liquid,”
J. Applied Math. Mech. 25 (1962), 1700-1705.

Calculation of the curvature of a group of diffeomorphisms has been
carried out completely only in the case of a flow on the two-dimensional
torus with euclidean metric. Such a torus is obtained from the euclidean
plane $\mathbb{R}^2$ by identifying points whose difference lies in some lattice
(a discrete subgroup of the plane). An example of such a lattice is the set of
points with integral coordinates. In general, to obtain an arbitrary lattice
$\Gamma$ we may replace the square lying at the basis of this special lattice by
any parallelogram.

Now consider the Lie algebra of vector fields with divergence zero on the
torus with a single-valued stream function. The corresponding group
$S_0\mathrm{Diff}\,T^2$ consists of volume-preserving diffeomorphisms which
leave the center of mass of the torus fixed. It is embedded in the group
$\mathrm{SDiff}\,T^2$ of all volume-preserving diffeomorphisms as a totally
geodesic submanifold (i.e., a submanifold such that each of its geodesics is a
geodesic in the ambient manifold).

The  proof  consists  of  the  fact  that  if,  at  the  initial  moment,  a  velocity 
field  of  an  ideal  fluid  has  a  single-valued  stream  function,  then  at  all  other 
moments  of  time  the  stream  function  will  also  be  single-valued;  this  follows 
from  the  law  of  conservation  of  momentum. 

We will now investigate the curvature of the group $S_0\mathrm{Diff}\,T^2$ in
all possible two-dimensional directions passing through the identity of the
group (the curvature of the group $\mathrm{SDiff}\,T^2$ in every such direction
is the same, since the submanifold $S_0\mathrm{Diff}\,T^2$ is totally geodesic).

Choose an orientation on $\mathbb{R}^2$. Then elements of the Lie algebra of the
group $S_0\mathrm{Diff}\,T^2$ can be thought of as real functions on the torus
having average value zero (a field with divergence zero is obtained from such a
function by considering it to be a stream function). Therefore, a
two-dimensional direction in the tangent space to the group
$S_0\mathrm{Diff}\,T^2$ is determined by a pair of functions on the torus with
average value zero.

We will give such a function by the set of its Fourier coefficients. It is
convenient to carry out all calculations with Fourier series in the complex
domain. We let $e_k$ (where $k$, called a wave vector, is a point of the
euclidean plane) denote the function whose value at a point $x$ of our plane is
equal to $e^{i(k, x)}$. Such a function determines a function on the torus if it
is $\Gamma$-periodic, i.e., if adding a vector from the lattice $\Gamma$ to $x$
does not change the value of the function.

In other words, the scalar product $(k, x)$ must be a multiple of $2\pi$ for all
$x \in \Gamma$. All such vectors $k$ belong to a lattice $\Gamma^*$ on
$\mathbb{R}^2$. The functions $e_k$, where $k \in \Gamma^*$, form a complete
system in the space of complex functions on the torus.

We now complexify our Lie algebra, scalar product $\langle\ ,\ \rangle$,
commutator $[\ ,\ ]$ and operation $B$ in the algebra, as well as the riemannian
connection and curvature tensor $\Omega$, so that all these functions become
(multi-)linear in the complex vector space of the complexified Lie algebra. The
functions $e_k$ (where $k \in \Gamma^*$, $k \ne 0$) form a basis of this vector
space.

Theorem 14. The explicit formulas for the scalar product, commutator,
operation $B$, connection, and curvature of a right-invariant metric on the
group $S_0\mathrm{Diff}\,T^2$ have the following form:

$$\langle e_k, e_l\rangle = 0 \quad \text{for } k + l \ne 0,$$

$$\langle e_k, e_{-k}\rangle = k^2 S;$$

$$[e_k, e_l] = (k \wedge l)\,e_{k+l};$$

$$B(e_k, e_l) = b_{k,l}\,e_{k+l}, \quad \text{where } b_{k,l} = (k \wedge l)\frac{k^2}{(k+l)^2};$$

$$\nabla_{e_k} e_l = d_{l,k+l}\,e_{k+l}, \quad \text{where } d_{u,v} = \frac{(v \wedge u)(u \cdot v)}{v^2};$$

$R_{k,l,m,n} = 0$ if $k + l + m + n \ne 0$; if $k + l + m + n = 0$, then

$$R_{k,l,m,n} = (a_{ln}a_{km} - a_{lm}a_{kn})\,S, \quad \text{where } a_{uv} = \frac{(u \wedge v)^2}{|u + v|}.$$


In these formulas, $S$ is the area of the torus, and $u \wedge v$ the area of the
parallelogram spanned by $u$ and $v$ (with respect to the chosen orientation of
$\mathbb{R}^2$). The parentheses denote the euclidean scalar product in the
plane, and angled brackets denote the scalar product in the Lie algebra.

The  proof  of  this  theorem  is  in  the  first  article  listed  in  the  introduction  to 
this  appendix. 
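
The coefficients of Theorem 14 can be cross-checked against each other with a short script. The sketch below assumes two relations that are not restated in this section: that $B$ is defined by $\langle [a, b], c\rangle = \langle B(c, a), b\rangle$, and that the connection coefficient equals the combination $\tfrac12([e_k, e_l] - B(e_k, e_l) - B(e_l, e_k))$. It uses the values $b_{k,l} = (k \wedge l)\,k^2/(k+l)^2$ and $d_{u,v} = (v \wedge u)(u \cdot v)/v^2$. All function names are hypothetical.

```python
def wedge(u, v):
    """Oriented area of the parallelogram spanned by u and v."""
    return u[0] * v[1] - u[1] * v[0]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def pairing(k, l, S):
    """<e_k, e_l>: zero unless k + l = 0, in which case k^2 * S."""
    return dot(k, k) * S if add(k, l) == (0, 0) else 0.0

def b(k, l):
    """Coefficient b_{k,l} in B(e_k, e_l) = b_{k,l} e_{k+l}."""
    s = add(k, l)
    return wedge(k, l) * dot(k, k) / dot(s, s)

def d(u, v):
    """Coefficient d_{u,v} = (v ^ u)(u . v) / v^2."""
    return wedge(v, u) * dot(u, v) / dot(v, v)

def connection_from_B(k, l):
    """Coefficient of e_{k+l} in (1/2)([e_k,e_l] - B(e_k,e_l) - B(e_l,e_k))."""
    return 0.5 * (wedge(k, l) - b(k, l) - b(l, k))

k, l, S = (1, 2), (3, -1), 1.0
m = (-(k[0] + l[0]), -(k[1] + l[1]))  # the only mode pairing with e_{k+l}

# Assumed defining relation of B: <[e_l, e_m], e_k> = <B(e_k, e_l), e_m>
lhs = wedge(l, m) * pairing(add(l, m), k, S)
rhs = b(k, l) * pairing(add(k, l), m, S)
print(lhs, rhs)

# Connection coefficient obtained two ways
print(connection_from_B(k, l), d(l, add(k, l)))
```

Both printed pairs agree for any wave vectors with $k + l \ne 0$, which is a useful sanity check when the formulas have to be retyped.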

The  formulas  above  allow  us  to  calculate  the  curvature  in  any  two- 
dimensional  direction.  These  calculations  show  that  in  most  directions  the 
curvature  is  negative,  but  in  a  few  it  is  positive.  Consider,  for  instance,  some 
fluid  flow,  i.e.  a  geodesic  of  our  group.  By  Jacobi’s  equations,  the  stability  of 
this  geodesic  is  determined  by  the  curvatures  in  the  directions  of  all  possible 
two-dimensional  planes  passing  through  the  velocity  vector  of  the  geodesic 
at  each  of  its  points. 

Assume now that the flow under consideration is stationary. Then the
geodesic is a one-parameter subgroup of our group. From this it follows that the
curvatures in the directions of all planes passing through velocity vectors of
the geodesic at all of its points are equal to the curvatures in the corresponding
planes going through the velocity vector of this geodesic at the initial moment
of time (Proof: right translate to the identity element of the group). Thus the
stability of a stationary flow depends only on the curvatures in the directions
of those two-dimensional planes in the Lie algebra which contain the vector
of the Lie algebra which is the velocity field of the stationary flow.

Consider, for example, the simplest parallel sinusoidal stationary flow.
Such a flow is given by the stream function

$$\xi = \frac{e_k + e_{-k}}{2}.$$

Consider any other real vector of the algebra, $\eta = \sum x_l e_l$ (so
$x_{-l} = \bar{x}_l$). We deduce easily from Theorem 14 that

Theorem 15. The curvature of the group $S_0\mathrm{Diff}\,T^2$ in any
two-dimensional plane containing the direction $\xi$ is non-positive. Namely,

$$\langle \Omega(\xi, \eta)\xi, \eta\rangle = -\frac{S}{4}\sum_l a_{k,l}^2\,|x_l + x_{l+2k}|^2.$$

From this formula it follows, in particular, that

1. The curvature is equal to zero only for those two-dimensional planes which
consist of parallel flows in the same direction as $\xi$, so that
$[\xi, \eta] = 0$;

2. The curvature in the plane defined by the stream functions $\xi = \cos kx$,
$\eta = \cos lx$ is

$$K = -\frac{k^2 + l^2}{4S}\,\sin^2\alpha\,\sin^2\beta,$$

where $S$ is the area of the torus, $\alpha$ is the angle between $k$ and $l$,
and $\beta$ is the angle between $k + l$ and $k - l$;

3. In particular, the curvature of the group of diffeomorphisms of the torus
$\{(x, y) \bmod 2\pi\}$ in directions determined by the velocity fields
$(\sin y, 0)$, $(0, \sin x)$ is equal to $K = -1/(8\pi^2)$.
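
Items 2 and 3 can be checked against the sum in Theorem 15. The sketch below takes $\xi = (e_k + e_{-k})/2$ and $\eta = (e_l + e_{-l})/2$ with $l \ne \pm k$, evaluates $-\tfrac{S}{4}\sum_m a_{k,m}^2 |x_m + x_{m+2k}|^2$ with $a_{u,v} = (u \wedge v)^2/|u+v|$, and divides by $\langle\xi,\xi\rangle\langle\eta,\eta\rangle - \langle\xi,\eta\rangle^2$. That normalization of the sectional curvature is an assumption carried over from the finite-dimensional case, and the function names are hypothetical.

```python
import math

def wedge(u, v):
    return u[0] * v[1] - u[1] * v[0]

def norm2(u):
    return u[0] ** 2 + u[1] ** 2

def a_squared(u, v):
    """a_{u,v}^2 with a_{u,v} = (u ^ v)^2 / |u + v|; zero when u + v = 0."""
    s = (u[0] + v[0], u[1] + v[1])
    if s == (0, 0):
        return 0.0  # u and v are then parallel, so u ^ v = 0 as well
    return wedge(u, v) ** 4 / norm2(s)

def curvature_from_sum(k, l, S):
    """K for xi = (e_k + e_-k)/2 and eta = (e_l + e_-l)/2, with l != +-k."""
    x = {l: 0.5, (-l[0], -l[1]): 0.5}  # Fourier coefficients of eta
    # Modes m for which x_m + x_{m+2k} can be nonzero
    modes = set(x) | {(m[0] - 2 * k[0], m[1] - 2 * k[1]) for m in x}
    total = 0.0
    for m in modes:
        shifted = (m[0] + 2 * k[0], m[1] + 2 * k[1])
        coeff = x.get(m, 0.0) + x.get(shifted, 0.0)
        total += a_squared(k, m) * coeff ** 2
    numerator = -S / 4 * total
    # <xi, xi> = k^2 S / 2, <eta, eta> = l^2 S / 2, <xi, eta> = 0
    denominator = (norm2(k) * S / 2) * (norm2(l) * S / 2)
    return numerator / denominator

def curvature_closed_form(k, l, S):
    """K = -(k^2 + l^2) / (4 S) * sin^2(alpha) * sin^2(beta)."""
    plus = (k[0] + l[0], k[1] + l[1])
    minus = (k[0] - l[0], k[1] - l[1])
    sin2_alpha = wedge(k, l) ** 2 / (norm2(k) * norm2(l))
    sin2_beta = wedge(plus, minus) ** 2 / (norm2(plus) * norm2(minus))
    return -(norm2(k) + norm2(l)) / (4 * S) * sin2_alpha * sin2_beta

S = 4 * math.pi ** 2
print(curvature_from_sum((0, 1), (1, 0), S))
print(curvature_closed_form((0, 1), (1, 0), S))
print(-1 / (8 * math.pi ** 2))
```

For $k = (0, 1)$, $l = (1, 0)$ both angles are right angles, so the closed form reduces to $-2/(4S) = -1/(8\pi^2)$.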


L  Discussion 

It is natural to expect that the curvature of a group of diffeomorphisms is
related to the stability of geodesics in this group (i.e. to the stability of flows
of an ideal fluid) in the same way as the curvature of a finite-dimensional Lie
group is related to the stability of geodesics on it. Namely, negative curvature
causes exponential instability of geodesics. The characteristic path length
(the average path length in which errors in the initial conditions grow $e$
times) has order of magnitude $1/\sqrt{-K}$. Thus, knowing the curvatures of a
group of diffeomorphisms allows us to estimate the time for which we can
predict the development of the flow of an ideal fluid by means of an
approximate initial velocity field before the error grows to a large order.


It should be emphasized that instability of a flow of an ideal fluid is here
understood differently than in section K; it is a question of exponential
instability of the motion of the fluid, not of its velocity field. It is
possible for a stationary flow to be a Liapunov stable solution of Euler’s
equation while the corresponding motion of the fluid is exponentially unstable.
The reason is that a small change in the velocity field of a fluid can induce an
exponentially growing change in the motion of the fluid. In such a case
(stability of the solution of Euler’s equation and negative curvature of the
group) we can predict the velocity field, but we cannot predict the motion of
the fluid mass without a great loss of accuracy.


The formulas mentioned above for curvature can be used even for rough
estimates of the time over which a long-term dynamical prediction of the
weather is impossible, if we agree to a few simplifying assumptions. These
simplifying assumptions consist of the following:

1.  The  earth  has  the  shape  of  a  torus  obtained  by  factoring  the  plane  by  a 
square  lattice. 

2.  The  atmosphere  is  a  two-dimensional  homogeneous  non-compressible 
non-viscous  fluid. 

3.  The  motion  of  the  atmosphere  is  approximately  a  “tradewind  current,” 
parallel  to  the  equator  of  the  torus  and  having  sinusoidal  velocity  profile. 

To calculate the characteristic path length we must then estimate the
curvature of the group $S_0\mathrm{Diff}\,T^2$ in directions containing the
“tradewind current” $\xi$ from Theorem 15. To do this we will look at $T^2$ as
$\{(x, y) \bmod 2\pi\}$, $k = (0, 1)$. In other words, we look at
$2\pi$-periodic flows on the $(x, y)$-plane close to a stationary flow,
parallel to the $x$-axis and with sinusoidal velocity profile

$$v = (\sin y, 0).$$

It is easy to see from the formula in Theorem 15 that the curvature of the
group $S_0\mathrm{Diff}\,T^2$ in the planes containing our tradewind current $v$
varies within the limits

$$-\frac{2}{S} < K < 0, \quad \text{where } S = 4\pi^2 \text{ is the area of the torus.}$$

Here the lower limit is obtained by a rather crude estimate. However, a
direction with curvature $K = -1/(2S)$ certainly exists, and there are many
other directions with curvature of approximately the same size. In order to
make a rough estimate of the characteristic path length, we make the rough
guess $K_0 = -1/(2S)$ as value of the “mean curvature.”

If we agree to start from this value $K_0$ of the curvature, we obtain the
characteristic path length

$$s = \left(\sqrt{-K_0}\right)^{-1} = \sqrt{2S}.$$

The velocity of motion with respect to the group which corresponds to
our tradewind current is equal to $\sqrt{S/2}$ (since the average square value
of the sine is $\tfrac{1}{2}$). Therefore, the time it takes for our flow to
travel the characteristic path length is equal to 2. The fastest particles of
the fluid go a distance of 2 after this time, i.e., $1/\pi$ of the entire orbit
around the torus.

Thus, if we take our value of the mean curvature, then the error grows by
$e^{\pi} \approx 20$ after the time of one orbit of the fastest particle. Taking
the value 100 km/hr as the maximal velocity of the tradewind current, we get
400 hours for the time of orbit, i.e., less than three weeks.

Thus, if at the initial moment the state of the weather was known with
small error $\varepsilon$, then the order of magnitude of the error of
prediction after $n$ months would be

$$10^{kn}\varepsilon, \quad \text{where } k \approx \frac{30 \cdot 24}{400}\,\pi \log_{10} e \approx 2.5.$$

For example, to predict the weather two months in advance we must have
initial data with five more digits of accuracy than the prediction accuracy.
Practically, this means that calculating the weather for such a period is
impossible.
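
The arithmetic behind this estimate can be retraced step by step. The sketch below uses only the numbers quoted in the text; the 40000 km circumference is an assumption inferred from the stated 400 hours at 100 km/hr, a month is taken as 30 days, and the variable names are illustrative.

```python
import math

S = 4 * math.pi ** 2                  # area of the torus {(x, y) mod 2*pi}
K0 = -1 / (2 * S)                     # guessed "mean curvature"
path_length = 1 / math.sqrt(-K0)      # characteristic path length, sqrt(2S)
speed = math.sqrt(S / 2)              # speed of the geodesic in the group
e_folding_time = path_length / speed  # time in which errors grow e times

orbit_time = 2 * math.pi / 1.0        # fastest particle has unit speed
growth_per_orbit = math.exp(orbit_time / e_folding_time)

orbit_hours = 40000 / 100             # assumed circumference in km / (km/hr)
month_hours = 30 * 24
# Decimal digits of accuracy lost per month
k = (month_hours / orbit_hours) * math.pi * math.log10(math.e)

print(e_folding_time)      # time to traverse the characteristic path length
print(growth_per_orbit)    # error growth factor per orbit, e**pi
print(k, 2 * k)            # digits lost in one month and in two months
```

The growth factor per orbit is $e^{\pi}$, which the text rounds to 20, and two months cost about $2k \approx 5$ decimal digits.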

It is clear that the estimates mentioned here are not very sharp, and the
model we took is very simplified. The choice of the value of the “mean
curvature” also requires justification.

The symplectic manifolds of classical mechanics are most often phase spaces
of lagrangian mechanical systems, i.e., cotangent bundles of configuration
spaces.

An  entirely  different  series  of  symplectic  manifolds  arises  in  algebraic 
geometry. 

For  example,  any  smooth  complex  algebraic  manifold  (given  by  a  system 
of  polynomial  equations  in  complex  projective  space)  has  a  natural  symplectic 
structure. 

The construction of a symplectic structure on an algebraic manifold is
based on the fact that complex projective space itself has a particular
symplectic structure, namely the imaginary part of its hermitian structure.

A  The  hermitian  structure  of  complex 
projective  space 

Recall that $n$-dimensional complex projective space $\mathbb{C}P^n$ is the
manifold of all complex lines passing through the point 0 in an
$(n + 1)$-dimensional complex vector space $\mathbb{C}^{n+1}$. To construct a
symplectic structure on $\mathbb{C}P^n$ we use the hermitian structure in the
corresponding vector space $\mathbb{C}^{n+1}$.

Recall that a hermitian scalar product (or hermitian structure) on a complex
vector space is a complex-valued function of pairs of vectors, which (1) is
linear in the first and anti-linear in the second variable, (2) changes its
value to the complex conjugate when the arguments are interchanged, and (3)
becomes a positive-definite real quadratic form if we take the arguments
equal:

$$\langle \lambda\xi, \eta\rangle = \lambda\langle \xi, \eta\rangle, \qquad \langle \eta, \xi\rangle = \overline{\langle \xi, \eta\rangle}, \qquad \langle \xi, \xi\rangle > 0$$

for $\xi \ne 0$.

An example of a hermitian scalar product is

(1)  $\langle \xi, \eta\rangle = \sum \xi_k \bar{\eta}_k,$

where $\xi_k$ and $\eta_k$ are the coordinates of the vectors $\xi$ and $\eta$
in some basis.

A  basis  for  which  a  hermitian  scalar  product  has  the  form  (1)  always  exists,  and  is  called  a 
hermitian-orthonormal  basis. 

The real and imaginary parts of a hermitian scalar product are real bilinear
forms. The first is symmetric, and the second skew-symmetric, and both are
nondegenerate:

$$\langle \xi, \eta\rangle = (\xi, \eta) + i[\xi, \eta], \qquad (\xi, \eta) = (\eta, \xi), \qquad [\xi, \eta] = -[\eta, \xi].$$

The quadratic form $(\xi, \xi)$ is positive-definite.

Thus a hermitian structure $\langle\ ,\ \rangle$ on a complex vector space gives
it a euclidean structure $(\ ,\ )$ and a symplectic structure $[\ ,\ ]$. These
two structures are related to the complex structure by the relation

$$[\xi, \eta] = (\xi, i\eta).$$
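
These identities are easy to confirm on sample vectors. The sketch below uses the standard hermitian product (1) in $\mathbb{C}^3$; the function names and the two test vectors are arbitrary choices made for illustration.

```python
def hermitian(xi, eta):
    """<xi, eta> = sum xi_k * conj(eta_k): linear in xi, anti-linear in eta."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))

def euclidean(xi, eta):
    """Real part ( , ) of the hermitian product."""
    return hermitian(xi, eta).real

def skew(xi, eta):
    """Imaginary part [ , ] of the hermitian product."""
    return hermitian(xi, eta).imag

xi = [1 + 2j, -0.5j, 3.0 + 0j]
eta = [2 - 1j, 1 + 1j, -1j]
i_eta = [1j * c for c in eta]  # the vector i * eta

print(euclidean(xi, eta), euclidean(eta, xi))  # symmetric
print(skew(xi, eta), -skew(eta, xi))           # skew-symmetric
print(skew(xi, eta), euclidean(xi, i_eta))     # [xi, eta] = (xi, i eta)
```

The last identity depends on the convention that the product is anti-linear in the second argument: $\langle \xi, i\eta\rangle = -i\langle \xi, \eta\rangle$, whose real part is $[\xi, \eta]$.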

We will now define a riemannian metric on complex projective space.
To do this, consider the unit sphere

$$S^{2n+1} = \{z \in \mathbb{C}^{n+1} : \langle z, z\rangle = 1\}$$

in the corresponding vector space $\mathbb{C}^{n+1}$. This sphere inherits the
riemannian metric from $\mathbb{C}^{n+1}$. Every complex line intersects our
sphere in a great circle.

Definition. The distance between two points of complex projective space is
the distance between the two corresponding circles on the unit sphere.

We note that these two circles are parallel in the sense that the distance
from any point of one of the circles to the other is the same (Proof:
multiplication of $z$ by $e^{i\varphi}$ preserves the metric on the sphere).
This circumstance allows us at once to write down an explicit formula (2) for
the riemannian metric on the complex projective space given by the construction
defined above.

In fact, let $p$ denote the mapping

$$p: \mathbb{C}^{n+1} \setminus 0 \to \mathbb{C}P^n,$$

taking a point $z \ne 0$ of the vector space $\mathbb{C}^{n+1}$ to the complex
line passing through 0 and $z$.

Every vector $\zeta$ tangent to $\mathbb{C}P^n$ at the point $pz$ can be
represented (in many ways) as the image of a vector $\xi$ at the point $z$
under this map:

$$\zeta = p_*\xi, \qquad \xi \in T\mathbb{C}^{n+1}_z.$$


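Theorem. The square of the length of a vector $\zeta$ in the riemannian metric
defined above is given by the formula

(2)  $ds^2(\zeta) = \dfrac{\langle \xi, \xi\rangle\langle z, z\rangle - \langle \xi, z\rangle\langle z, \xi\rangle}{\langle z, z\rangle^2}.$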


Proof. Assume first that the point $z$ lies on the unit sphere $S^{2n+1}$.

Decompose the vector $\xi$ into two components: one in the complex line
determined by the vector $z$ and the other in the hermitian-orthogonal
direction. Note that hermitian-orthogonal to the vector $z$ means
euclidean-orthogonal to the vectors $z$ and $iz$. The vector $z$ is a euclidean
normal vector to the sphere $S^{2n+1}$ at $z$. The vector $iz$ is a vector
tangent to the circle in which the sphere intersects the complex line passing
through $z$. Thus the component $\eta$ of the vector $\xi$ which is
hermitian-orthogonal to $z$ is tangent to the sphere $S^{2n+1}$ and
euclidean-orthogonal to the circle in which the sphere intersects the line $pz$.

By the definition of the metric on $\mathbb{C}P^n$, the riemannian square of the
length of the vector $\zeta$ is equal to the euclidean square length of the
component $\eta$ of $\xi$ which is hermitian-orthogonal to $z$.

We calculate the component $\eta$ of $\xi$, hermitian-orthogonal to $z$. We
write our decomposition as

$$\xi = cz + \eta, \quad \text{where } \langle \eta, z\rangle = 0.$$

By hermitian multiplication with $z$, we find

$$\langle \xi, z\rangle = c\langle z, z\rangle,$$

so

$$\eta = \frac{\langle z, z\rangle\xi - \langle \xi, z\rangle z}{\langle z, z\rangle}.$$

Calculating the hermitian square of the vector $\eta$ we find
$\langle \eta, \eta\rangle = \langle \eta, \xi\rangle$ and

$$\langle \eta, \eta\rangle = \frac{\langle z, z\rangle\langle \xi, \xi\rangle - \langle \xi, z\rangle\langle z, \xi\rangle}{\langle z, z\rangle}.$$

Thus, formula (2) is proved for points $z$ of the unit sphere. The general case
follows from looking at the homothetic transformation $z \to z/|z|$. □

Note that our construction allows us to define not only a euclidean
structure (2), but also a hermitian structure on the tangent space to
$\mathbb{C}P^n$. Consider the hermitian-orthogonal complement $H$ to the
direction of the vector $z$ in the space $T\mathbb{C}^{n+1}_z$, where
$z \in S^{2n+1}$. The map $p_*: H \to T(\mathbb{C}P^n)_{pz}$ maps $H$
isomorphically (as we showed above) onto the tangent space to $\mathbb{C}P^n$
and carries over the hermitian structure from $H$.

It is clear that the scalar square defined by this hermitian structure is given
by formula (2). Therefore, the formula for the hermitian scalar product in
the tangent space to $\mathbb{C}P^n$ can be written down without further
calculations:

(3)  $\langle \zeta_1, \zeta_2\rangle = \dfrac{\langle \xi_1, \xi_2\rangle\langle z, z\rangle - \langle \xi_1, z\rangle\langle z, \xi_2\rangle}{\langle z, z\rangle^2}$

for any vectors $\xi_1$, $\xi_2$ in $T\mathbb{C}^{n+1}_z$ satisfying the relation
$p_*\xi_k = \zeta_k \in T(\mathbb{C}P^n)_{pz}$. We note that in formula (3) the
point $z$ does not necessarily lie on the unit sphere.
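
That the right-hand side of (3) depends only on the tangent vectors to projective space, and not on the chosen representatives, can be tested numerically. The sketch below checks two properties: adding multiples of $z$ to $\xi_1$ or $\xi_2$ (vectors along the complex line, which $p_*$ sends to zero) changes nothing, and neither does replacing $(z, \xi_1, \xi_2)$ by $(\lambda z, \lambda\xi_1, \lambda\xi_2)$ for a nonzero complex $\lambda$. The function name `fubini_study` and the sample vectors are illustrative assumptions.

```python
def hermitian(u, v):
    """Standard hermitian product: linear in u, anti-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def fubini_study(xi1, xi2, z):
    """Right-hand side of formula (3) for representatives xi1, xi2 at z."""
    zz = hermitian(z, z)
    numerator = hermitian(xi1, xi2) * zz - hermitian(xi1, z) * hermitian(z, xi2)
    return numerator / zz ** 2

def axpy(c, u, v):
    """Return the vector c * u + v."""
    return [c * a + b for a, b in zip(u, v)]

z = [1 + 1j, 2 - 0.5j, -1j]
xi1 = [0.3 - 1j, 1 + 2j, 0.5 + 0j]
xi2 = [-1 + 0j, 0.25j, 2 - 2j]

base = fubini_study(xi1, xi2, z)

# Adding multiples of z to the representatives does not change the value
shifted = fubini_study(axpy(2 - 3j, z, xi1), axpy(-0.7j, z, xi2), z)

# Rescaling z together with the representatives does not change it either
lam = 0.4 + 1.3j
scaled = fubini_study([lam * c for c in xi1], [lam * c for c in xi2],
                      [lam * c for c in z])

print(base, shifted, scaled)
```

Taking `xi1 == xi2` gives a real non-negative number, the squared length of formula (2).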

The euclidean and hermitian structures (2) and (3) constructed on the
tangent spaces to $\mathbb{C}P^n$ are not invariant under all projective
transformations of the manifold $\mathbb{C}P^n$, but are invariant under those
which are given by unitary (preserving the hermitian structure) linear
transformations of the vector space $\mathbb{C}^{n+1}$.


B  The  symplectic  structure  of  complex 
projective  space 

We consider the imaginary part of the hermitian form (3), taken with
coefficient $-1/\pi$ (the reason for taking this coefficient is explained in
Problem 1, Section C):

(4)  $\Omega(\zeta_1, \zeta_2) = -\dfrac{1}{\pi}\operatorname{Im}\langle \zeta_1, \zeta_2\rangle.$

Like the imaginary part of any hermitian form, the real bilinear form $\Omega$
on the tangent space to complex projective space is skew-symmetric and
nondegenerate.


Theorem. The differential 2-form $\Omega$ gives a symplectic structure on
complex projective space.


Proof. We need only verify that the form $\Omega$ is closed.

Consider the exterior derivative $d\Omega$ of the form $\Omega$. This
differential 3-form on $\mathbb{C}P^n$ is invariant with respect to mappings
induced by unitary transformations of the space $\mathbb{C}^{n+1}$. It follows
from this that it is equal to zero.

To see this, we look at a hermitian-orthonormal basis $e_1, \ldots, e_n$ of the
tangent space to $\mathbb{C}P^n$ at some point $z$. Then the vectors
$e_1, \ldots, e_n, ie_1, \ldots, ie_n$ form a euclidean-orthonormal
$\mathbb{R}$-basis. We will show that the value of the form $d\Omega$ on any
triple of these $\mathbb{R}$-basis vectors is equal to zero. (We assume that
$n > 1$; for $n = 1$ there is nothing to prove.)

Note that in any triple of $\mathbb{R}$-basis vectors at least one is
hermitian-orthogonal to the two others. Denote this vector by $e$. It is easy
to construct a unitary transformation of the space $\mathbb{C}^{n+1}$ inducing
a motion on $\mathbb{C}P^n$ which fixes the point $z$ and the
hermitian-orthogonal complement to $e$, and changes the direction of $e$.

The value of the form $d\Omega$ on our three vectors $e$, $f$, and $g$ is equal
to its value on the triple $-e$, $f$, and $g$ by the invariance of the form
$\Omega$, and is hence equal to zero. □

Remark. Another method of constructing the same symplectic structure
on complex projective space consists of the following. Consider small
oscillations of a mathematical pendulum with an $(n + 1)$-dimensional
configuration space. We make use of the integral of energy to decrease by 1 the
degree of freedom of the system. The phase space obtained after this operation
is $\mathbb{C}P^n$, and the symplectic structure on it agrees with the form
$\Omega$ described above up to a factor.

One other method of constructing a symplectic structure on $\mathbb{C}P^n$ uses
the fact that this space may be represented as one of the orbits of the
co-adjoint representation of a Lie group, and on every such orbit there is
always a standard symplectic structure (cf. Appendix 2, Section A). For the Lie
group we can take the group of unitary (preserving the hermitian metric)
operators in an $(n + 1)$-dimensional complex space. The orbits of the
co-adjoint representation in this case are the same as of the adjoint
representation. In the adjoint representation the operator of reflection
through a hyperplane (which changes the sign of the first coordinate and leaves
the others fixed) has $\mathbb{C}P^n$ as its orbit, since the reflection
operator is uniquely determined by the complex line orthogonal to the
hyperplane.

C  Symplectic  structure  on  algebraic  manifolds 

We will now obtain a symplectic structure on any complex submanifold $M$
of complex projective space. Let $j: M \to \mathbb{C}P^n$ be an embedding of
the complex manifold $M$ into complex projective space. The riemannian,
hermitian, and symplectic structures on projective space induce corresponding
structures on $M$. For example, the symplectic structure on $M$ is given by the
formula

$$\Omega_M = j^*\Omega.$$

Theorem. The differential form $\Omega_M$ gives a symplectic structure on the
manifold $M$.

Proof. The nondegeneracy of the 2-form $\Omega_M$ follows from the fact that $M$
is a complex submanifold. In fact, the quadratic form

$$(\xi, \xi) = \Omega_M(\xi, i\xi)$$

is positive definite (it is induced by the riemannian metric on
$\mathbb{C}P^n$). Therefore, the bilinear form
$(\xi, \eta) = \Omega_M(\xi, i\eta)$ is nondegenerate. This means that the
form $\Omega_M$ is also nondegenerate. The form $\Omega_M$ is closed since the
form $\Omega$ is closed. □

Remark. In the same way as for complex projective space, we define a
hermitian structure on the tangent spaces of its complex submanifolds; the
symplectic structure is the imaginary part.

A complex manifold with a hermitian metric whose imaginary part is a
closed form (i.e. a symplectic structure) is called a Kähler manifold and its
hermitian metric a Kähler metric. Many important results have been
obtained in the geometry of Kähler manifolds; in particular, they have
remarkable topological properties (cf., for example, A. Weil, “Variétés
kählériennes,” Hermann, 1958).

Not all symplectic manifolds admit a Kähler structure.

Problem 1. Calculate the symplectic structure $\Omega$ in the affine chart
$w = z_1 : z_0$ of the projective line $\mathbb{C}P^1$.

Answer. $\Omega = (1/\pi)(dx \wedge dy)/(1 + x^2 + y^2)^2$, where $w = x + iy$.
The coefficient in the definition of the form $\Omega$ is chosen to obtain the
usual orientation of the complex line ($dx \wedge dy$) and so that the integral
of the form $\Omega$ along the whole projective line is equal to 1.
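
The normalization can be verified numerically. In polar coordinates the integral of $(1/\pi)\,dx\,dy/(1 + x^2 + y^2)^2$ over the plane becomes $\int_0^\infty 2r\,dr/(1 + r^2)^2$; the sketch below maps the half-line to $[0, \pi/2)$ by the substitution $r = \tan\theta$ and applies the midpoint rule. The function name and step count are arbitrary choices for illustration.

```python
import math

def total_area(steps=100000):
    """Integrate (1/pi) dx dy / (1 + x^2 + y^2)^2 over the whole plane.

    The angular integration contributes 2*pi, leaving the radial integral
    of 2 r / (1 + r^2)^2, evaluated here with r = tan(theta).
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h          # midpoint of the j-th subinterval
        r = math.tan(theta)
        dr_dtheta = 1 + r * r          # derivative of tan(theta)
        total += 2 * r / (1 + r * r) ** 2 * dr_dtheta * h
    return total

print(total_area())
```

After the substitution the integrand is simply $\sin 2\theta$, whose integral over $[0, \pi/2]$ is 1, in agreement with the stated normalization.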

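Problem 2. Show that the symplectic structure $\Omega$ in the affine chart
$w_k = z_k z_0^{-1}$ ($k = 1, \ldots, n$) of the projective space
$\mathbb{C}P^n = \{(z_0 : z_1 : \ldots : z_n)\}$ is given by the formula

$$\Omega = \frac{i}{2\pi}\,\frac{\sum_{0 \le k < l \le n} (w_k\,dw_l - w_l\,dw_k) \wedge (\bar{w}_k\,d\bar{w}_l - \bar{w}_l\,d\bar{w}_k)}{\left(\sum_{k=0}^{n} w_k \bar{w}_k\right)^2}.$$

By convention, $w_0 = 1$.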

Remark. Differential forms on a complex space with complex values (such as
$dw_k$ and $d\bar{w}_k$) are defined as complex linear functions of tangent
vectors; if $w_k = x_k + iy_k$, then

$$dw_k = dx_k + i\,dy_k, \qquad d\bar{w}_k = dx_k - i\,dy_k.$$

The space of such forms in $\mathbb{C}^n$ has complex dimension $2n$; the $2n$
forms $dw_k$, $d\bar{w}_k$ ($k = 1, \ldots, n$), for example, form a
$\mathbb{C}$-basis, or the $2n$ forms $dx_k$, $dy_k$.

Exterior multiplication is defined in the usual way and obeys the usual rules.
For example,

$$dw \wedge d\bar{w} = (dx + i\,dy) \wedge (dx - i\,dy) = -2i\,dx \wedge dy.$$

Let $f$ be a real-smooth function on $\mathbb{C}^n$ (with complex values, in
general). An example of such a function is $|w|^2 = \sum w_k \bar{w}_k$. The
differential of the function $f$ is a complex 1-form. Therefore, it can be
decomposed in the basis $dw_k$, $d\bar{w}_k$. The coefficients of this
decomposition are called the partial derivatives “with respect to $w_k$” and
“with respect to $\bar{w}_k$”:

$$df = \frac{\partial f}{\partial w}\,dw + \frac{\partial f}{\partial \bar{w}}\,d\bar{w}.$$

In calculating exterior derivatives it is also convenient to separate into differentiation $d'$
with respect to the variable $w$ and $d''$ with respect to the variable $\bar w$, so that $d = d' + d''$.

For example, for a function $f$

$$d'f = \frac{\partial f}{\partial w}\,dw, \qquad d''f = \frac{\partial f}{\partial \bar w}\,d\bar w.$$

For the differential 1-form

$$\omega = \sum a_k\,dw_k + b_k\,d\bar w_k$$

the operators $d'$ and $d''$ are defined analogously:

$$d'\omega = \sum d'a_k \wedge dw_k + d'b_k \wedge d\bar w_k,$$
$$d''\omega = \sum d''a_k \wedge dw_k + d''b_k \wedge d\bar w_k.$$
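
As an illustration of the operators $d'$ and $d''$, here is a worked computation for the single-variable function $f = \ln(1 + w\bar w)$. The choice of this particular function is made here for illustration; up to a constant factor the result has the same shape as the form in the answer to Problem 1.

```latex
\[
d''f = \frac{\partial f}{\partial \bar w}\,d\bar w = \frac{w\,d\bar w}{1 + w\bar w},
\qquad
d'd''f = \frac{\partial}{\partial w}\!\left(\frac{w}{1 + w\bar w}\right) dw \wedge d\bar w
       = \frac{dw \wedge d\bar w}{(1 + w\bar w)^2}.
\]
% Using dw \wedge d\bar w = -2i\,dx \wedge dy:
\[
\frac{i}{2\pi}\,d'd''f = \frac{1}{\pi}\,\frac{dx \wedge dy}{(1 + x^2 + y^2)^2}.
\]
```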

Problem 3. Show that the symplectic structure $\Omega$ on the affine chart $(w_k = z_k z_0^{-1})$ of the projective
space $\mathbb{CP}^n$ is given by the formula

$$\Omega = \frac{i}{2\pi}\,d'd'' \ln \sum_{k=0}^{n} |w_k|^2.$$

An odd-dimensional manifold cannot admit a symplectic structure. The
analogue of a symplectic structure for odd-dimensional manifolds is a little
less symmetric, but also very interesting: the contact structure.

Symplectic structures in mechanics come from phase spaces (i.e.,
cotangent bundles to configuration manifolds), on which there is always a
canonical symplectic structure. Contact structures come from manifolds
of contact elements of configuration spaces.

A contact element to an n-dimensional smooth manifold at some point is
an (n - 1)-dimensional plane tangent to the manifold at that point (i.e., an
(n - 1)-dimensional subspace of the n-dimensional tangent space at that
point).

The  set  of  all  contact  elements  of  an  n-dimensional  manifold  has  a  natural 
smooth manifold structure of dimension 2n - 1. It turns out that there is an
interesting  additional  “contact  structure”  on  this  odd-dimensional  manifold 
(we  describe  this  below). 

The manifold of contact elements of a riemannian n-dimensional manifold
is closely related to the (2n - 1)-dimensional manifold of unit tangent vectors
of this riemannian n-dimensional manifold, or to the (2n - 1)-dimensional
energy level manifold of a point mass moving on the riemannian manifold
under inertia. The contact structures on these (2n - 1)-dimensional manifolds
are closely related to the symplectic structure on the 2n-dimensional
phase space of the point (i.e., the cotangent bundle of the original
n-dimensional riemannian manifold).

A  Definition  of  contact  structure 

Definition. A contact structure on a manifold is a smooth field of tangent
hyperplanes98 satisfying a nondegeneracy condition which will be formulated
later.

To  formulate  this  condition  we  examine  what  a  field  of  hyperplanes  looks 
like  in  general  in  a  neighborhood  of  a  point  in  an  N-dimensional  manifold. 

Example. Let N = 2. Then the manifold is a surface and a field of hyperplanes
is a field of straight lines. Such a field in a neighborhood of a point is
always  constructed  very  simply,  namely,  as  a  field  of  tangents  to  a  family 
of  parallel  lines  in  a  plane.  More  precisely,  one  of  the  basic  results  of  the  local 
theory  of  ordinary  differential  equations  is  that  it  is  possible  to  change  any 
smooth  field  of  tangent  lines  on  a  manifold  into  a  field  of  tangents  to  a  family 
of  straight  lines  in  euclidean  space  by  using  a  diffeomorphism  in  a  sufficiently 
small  neighborhood  of  any  point  of  the  manifold. 

If N > 2, then a hyperplane is not a line, and the question becomes
significantly more complicated. For example, most fields of two-dimensional
tangent planes in ordinary three-dimensional space cannot be diffeomorphically
mapped onto a field of parallel planes. The reason is that there
exist fields of tangent planes for which it is impossible to find “integral
surfaces,” i.e., surfaces which have the prescribed tangent plane at each point.

98 A hyperplane in a vector space is a subspace of dimension 1 less than the dimension of the
space (i.e., the zero level set of a linear function which is not identically zero). A tangent
hyperplane is a hyperplane in a tangent space.

The  nondegeneracy  condition  for  a  field  of  hyperplanes  which  enters  into 
the  definition  of  contact  structure  consists  of  the  stipulation  that  the  field  of 
hyperplanes  must  be  maximally  far  from  a  field  of  tangents  to  a  family  of 
hyperplanes. In order to measure this distance, as well as to convince
ourselves of the existence of fields without integral hypersurfaces, we must make
a few constructions and calculations.99

B Frobenius' integrability condition

We  will  consider  some  point  on  an  N-dimensional  manifold  and  try  to 
construct  a  surface  passing  through  this  point  and  tangent  to  a  given  field 
of  (N  —  l)-dimensional  planes  at  each  point  (an  integral  surface). 

To  this  end  we  introduce  a  coordinate  system  onto  a  neighborhood  of 
this  point  so  that  at  the  point  itself  one  coordinate  surface  is  tangent  to  a 
plane  of  the  field.  We  will  call  this  plane  the  horizontal  plane,  and  will  call 
the  coordinate  axis  not  lying  in  it  the  vertical  axis. 

Construction  of  an  integral  surface.  An  integral  surface,  if  one  exists,  is  the 
graph  of  a  function  of  N  —  1  variables  near  the  origin.  To  construct  it,  we 
can  take  some  smooth  path  on  the  horizontal  plane.  Then  the  vertical  lines 
over  this  path  form  a  two-dimensional  surface  (cylinder);  our  field  of  planes 
intersects  its  tangent  planes  in  a  field  of  tangent  lines.  The  integral  surface 
we  are  looking  for,  if  it  exists,  intersects  this  cylinder  in  an  integral  curve  of  the 
field  of  lines,  starting  at  the  origin.  Such  an  integral  curve  always  exists 
independent  of  whether  an  integral  surface  exists.  Thus  we  can  construct  an 
integral  surface  over  the  horizontal  plane  by  moving  along  smooth  curves  in 
the  latter. 

In  order  to  obtain  a  smooth  integral  surface  from  all  the  integral  curves 
we need the result of our construction to be independent of the path, determined
only by its endpoint. In particular, for a circuit of a closed path in a
neighborhood  of  the  origin  in  the  horizontal  plane,  the  integral  curve  on  the 
cylinder  must  close  up. 

It  is  easy  to  construct  examples  of  fields  of  planes  for  which  such  closure 
does  not  take  place  and,  therefore,  for  which  an  integral  surface  does  not 
exist.  Such  fields  of  planes  are  called  nonintegrable. 

Example of a nonintegrable field of planes. In order to give a field of planes
and measure numerically the deviation from closure, we introduce the following
notation. We note first of all that a field of hyperplanes can be given locally
by a differential 1-form; a plane in the tangent space gives a 1-form up to
multiplication by a nonzero constant. We will choose this constant so that
the value of the form on the vertical basis vector is equal to 1.

99 From now on, we will omit the prefix “hyper-”. If we wish, we may assume that we are in
three-dimensional space and a hypersurface is an ordinary surface. The higher-dimensional
case is analogous to the three-dimensional case.

This  condition  can  be  satisfied  in  some  neighborhood  of  the  origin  since 
the  plane  of  the  field  at  zero  does  not  contain  the  vertical  direction.  This 
condition  determines  the  form  uniquely  (given  the  field  of  planes). 

A field of planes in ordinary three-space which does not have an integral
surface can be given, for example, by the 1-form

$$\omega = x\,dy + dz,$$

where  x  and  y  are  the  horizontal  coordinates  and  z  is  the  vertical.  The  proof 
of  the  fact  that  this  field  of  planes  is  nonintegrable  will  be  given  below. 
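
The failure of closure for this field can be observed numerically. The following is a minimal sketch, not part of the original text: the function name `lift_height` and the choice of a unit square as the closed path are assumptions made for illustration. On an integral curve the form vanishes, so dz = -x dy along the lifted path.

```python
def lift_height(path, steps_per_edge=1000):
    """Height gained by the integral curve of omega = x dy + dz over a polygonal path.

    On an integral curve omega vanishes, so dz = -x dy. The path is a list of
    (x, y) vertices in the horizontal plane; each straight edge is integrated
    with the midpoint rule.
    """
    z = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dy = (y1 - y0) / steps_per_edge
        for k in range(steps_per_edge):
            t = (k + 0.5) / steps_per_edge
            x = x0 + t * (x1 - x0)
            z -= x * dy
    return z


# Counterclockwise circuit of the unit square, starting and ending at the origin.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
gap = lift_height(square)
print(gap)  # close to -1 (minus the enclosed area): the lifted curve does not close up
```

For a square of side s the gap is approximately minus s squared, so it is of second order in the size of the circuit, in line with the construction of the 2-form described in the text.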

Construction  of  a  2-form  measuring  nonintegrability.  With  the  help  of  the 
form  giving  the  field,  we  can  measure  the  degree  of  nonintegrability.  This  is 
done  using  the  following  construction  (Figure  236). 


Figure  236  Integral  curves  constructed  for  a  non-integrable  field  of  planes 

Consider  a  pair  of  vectors  emanating  from  the  origin  and  lying  in  the 
horizontal  plane  of  our  coordinate  system.  Construct  a  parallelogram  on 
them.  We  obtain  two  paths  from  the  origin  to  the  opposite  vertex.  Over  each 
of  these  two  paths  we  can  construct  an  integral  curve  (with  two  sections)  as 
described  above.  As  a  result,  in  general,  there  arise  two  different  points  over 
the  vertex  of  the  parallelogram  opposite  to  the  origin.  The  difference  in  the 
heights  of  these  points  is  a  function  of  our  pair  of  vectors.  This  function  is 
skew-symmetric  and  equal  to  zero  if  one  of  the  vectors  is  equal  to  zero.  Thus 
the  linear  part  of  the  Taylor  series  of  this  function  is  zero  at  zero,  and  the 
quadratic  part  of  its  Taylor  series  is  a  bilinear  skew-symmetric  form  on  the 
horizontal  plane. 

If  the  field  is  integrable,  then  this  2-form  is  equal  to  zero.  Therefore,  this 
2-form  can  be  considered  as  a  measure  of  the  nonintegrability  of  the  field. 
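
For the form $\omega = x\,dy + dz$ the height difference can be computed in closed form. The following worked computation is a sketch; the component notation $\xi = (\xi_1, \xi_2)$, $\eta = (\eta_1, \eta_2)$ and the order in which the two paths are subtracted are choices made here, and the opposite order changes the sign.

```latex
% On an integral curve dz = -x\,dy. Path "\xi then \eta":
\[
z_{\xi\eta} = -\int_0^1 t\,\xi_1 \xi_2\,dt - \int_0^1 (\xi_1 + t\,\eta_1)\,\eta_2\,dt
            = -\tfrac12 \xi_1 \xi_2 - \xi_1 \eta_2 - \tfrac12 \eta_1 \eta_2 .
\]
% Path "\eta then \xi" is obtained by exchanging \xi and \eta:
\[
z_{\eta\xi} = -\tfrac12 \eta_1 \eta_2 - \eta_1 \xi_2 - \tfrac12 \xi_1 \xi_2 .
\]
% The difference is exactly bilinear and skew-symmetric:
\[
z_{\eta\xi} - z_{\xi\eta} = \xi_1 \eta_2 - \xi_2 \eta_1 = (dx \wedge dy)(\xi, \eta) = d\omega(\xi, \eta).
\]
```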

The  2-form  is  well  defined.  We  constructed  the  2-form  above  with  the  help 
of  coordinates.  However,  the  value  of  our  2-form  on  a  pair  of  tangent  vectors 
does  not  depend  on  the  coordinate  system,  but  only  on  the  1-form  used  to 
give  the  field. 

To  convince  ourselves  of  this,  it  is  enough  to  prove  the  following. 

Theorem. The 2-form defined above agrees with the exterior derivative of the
1-form $\omega$, $d\omega|_{\omega=0}$, on the null space of $\omega$.

Proof. We will show that the difference in the heights of the two points obtained as a result
of our two motions along the sides of the parallelogram is the same as the integral of the 1-form $\omega$
over the four sides of the parallelogram, up to a quantity small of third order with respect to
the sides of the parallelogram.

To this end we note that the height of the rise of an integral curve along any path of length $\varepsilon$
emanating from the origin has order $\varepsilon^2$, since at the origin the plane of the field is horizontal.
Therefore, the integrals of the 2-form $d\omega$ over all four vertical areas over the sides of the
parallelogram bounded by the integral curves and the horizontal plane have order $\varepsilon^3$ if the sides
are of order $\varepsilon$.

The integrals of the form $\omega$ along integral curves are exactly equal to zero. Therefore, by
Stokes' formula, the increase in height along the integral curve lying over any of the sides of the
parallelogram is equal to the integral of the 1-form $\omega$ along this side up to a quantity of
third-order smallness.

Now the theorem follows directly from the definition of exterior differentiation. □

Some arbitrariness remains in the choice of the 1-form $\omega$ which we used to
construct our 2-form. Namely, the form $\omega$ is defined by the field of planes
only up to multiplication by a function $f$ which is never zero. In other words,
we could have started with the form $f\omega$. Then we would have obtained the
2-form

$$d(f\omega) = f\,d\omega + df \wedge \omega,$$

which, on our plane, differs from the 2-form $d\omega$ by multiplication by the
nonzero number $f(0)$.

Thus the 2-form constructed on the plane of the field is defined invariantly
up to multiplication by a nonzero constant.

Condition  for  integrability  of  a  field  of  planes 

Theorem. If a field of hyperplanes is integrable, then the 2-form constructed
above on a plane of the field is equal to zero. Conversely, if the 2-form
constructed on every plane of the field is equal to zero, then the field is integrable.


Proof.  The  first  assertion  of  the  theorem  is  clear  by  the  construction  of  the  2-form.  The  proof 
of  the  second  assertion  can  be  carried  out  by  exactly  the  same  reasoning  we  used  to  prove  the 
commutativity  of  phase  flows  for  which  the  Poisson  bracket  of  the  velocity  fields  was  equal  to 
zero.  We  can  simply  refer  to  this  commutativity,  applying  it  to  the  integral  curves  arising  over 
the  lines  of  the  coordinate  directions  in  the  horizontal  plane.  □ 


Theorem. The integrability condition for a field of planes,

$$d\omega = 0 \quad \text{for } \omega = 0,$$

is equivalent to the following condition of Frobenius:

$$\omega \wedge d\omega = 0.$$


Proof. We consider the value of the 3-form above on any three distinct coordinate vectors.
Only one of these vectors can be the vertical. Therefore, of all the terms entering into the
definition of the value of the exterior product on the three vectors, only one is nonzero: the product of
the value of the form $\omega$ on the vertical vector with the value of the form $d\omega$ on the pair of
horizontal vectors. If the field given by the form is integrable, then the second factor is zero,
so our 3-form is zero on arbitrary triples of vectors.

Conversely, if the 3-form is equal to zero for any vectors, then it is equal to zero for any
triple of coordinate vectors, of which one is vertical and the other two horizontal. The value
of the 3-form on such a triple is equal to the product of the value of $\omega$ on the vertical vector
with the value of $d\omega$ on the pair of horizontal vectors. The first factor is not zero, so the second
must be zero, and thus the form $d\omega$ is zero on a plane of the field. □
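
The Frobenius condition is easy to test numerically in three-space. For a form P dx + Q dy + R dz, the 3-form in the theorem equals c dx ^ dy ^ dz with c = P (R_y - Q_z) + Q (P_z - R_x) + R (Q_x - P_y). The sketch below is an illustration only: the function name `frobenius_coefficient`, the finite-difference step, and the second (exact) test form are assumptions introduced here.

```python
def frobenius_coefficient(P, Q, R, point, h=1e-5):
    """Coefficient c in  omega ^ d(omega) = c dx^dy^dz  for omega = P dx + Q dy + R dz.

    c = P (R_y - Q_z) + Q (P_z - R_x) + R (Q_x - P_y), with the partial
    derivatives approximated by central differences of step h.
    """
    x, y, z = point

    def partial(f, axis):
        lo, hi = [x, y, z], [x, y, z]
        lo[axis] -= h
        hi[axis] += h
        return (f(*hi) - f(*lo)) / (2 * h)

    return (P(x, y, z) * (partial(R, 1) - partial(Q, 2))
            + Q(x, y, z) * (partial(P, 2) - partial(R, 0))
            + R(x, y, z) * (partial(Q, 0) - partial(P, 1)))


def zero(x, y, z):
    return 0.0


def one(x, y, z):
    return 1.0


sample = (0.3, -0.2, 0.5)

# omega = x dy + dz, the nonintegrable example of the text: coefficient near 1.
c_contact = frobenius_coefficient(zero, lambda x, y, z: x, one, sample)

# omega = y dx + x dy + dz = d(xy + z), an integrable field: coefficient near 0.
c_exact = frobenius_coefficient(lambda x, y, z: y, lambda x, y, z: x, one, sample)
```

The second form is the differential of the function xy + z, so its planes are tangent to the level surfaces of that function, and the coefficient vanishes as the theorem requires.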


C  Nondegenerate  fields  of  hyperplanes 

Definition. A field of hyperplanes is said to be nondegenerate at a point if the
rank of the 2-form $d\omega|_{\omega=0}$ in the plane of the field passing through this
point is equal to the dimension of the plane.

This  means  that  for  any  nonzero  vector  in  our  plane,  we  can  find  another 
vector  in  the  plane  such  that  the  value  of  the  2-form  on  this  pair  of  vectors 
is  not  zero. 

Definition. A field of planes is called nondegenerate on a manifold if it is
nondegenerate at every point of the manifold.

Note that on an even-dimensional manifold there cannot be a nondegenerate
field of hyperplanes; on such a manifold a hyperplane is odd-dimensional,
and the rank of every skew-symmetric bilinear form on an
odd-dimensional space is less than the dimension of the space (cf. Section 44).

Nondegenerate fields of hyperplanes do exist on odd-dimensional manifolds.

Example. Consider a euclidean space of dimension 2m + 1 with coordinates
x, y, and z (where x and y are vectors in an m-dimensional space and z is a
number). The 1-form

$$\omega = x\,dy + dz$$

defines a field of hyperplanes. The plane of the field passing through the origin
has equation dz = 0. We take x and y as coordinates in this hyperplane.
Therefore, in this plane of the field our 2-form can be written in the form

$$d\omega|_{\omega=0} = dx \wedge dy = dx_1 \wedge dy_1 + \dots + dx_m \wedge dy_m.$$

The  rank  of  this  form  is  2m,  so  our  field  is  nondegenerate  at  the  origin,  and 
thus  also  in  a  neighborhood  of  the  origin  (in  fact,  this  field  of  planes  is 
nondegenerate  at  all  points  of  the  space). 
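
The rank claim, and the parenthetical statement about all points of the space, can be checked with a short matrix computation. This is a sketch; the basis vectors $e_i$, $f_i$ below are notation introduced here and do not appear in the text.

```latex
% At an arbitrary point (x, y, z) the plane \omega = 0 is spanned by
\[
e_i = \frac{\partial}{\partial x_i}, \qquad
f_i = \frac{\partial}{\partial y_i} - x_i \frac{\partial}{\partial z}
\qquad (i = 1, \dots, m),
\]
% since \omega(e_i) = 0 and \omega(f_i) = x_i - x_i = 0. With d\omega = \sum dx_i \wedge dy_i:
\[
d\omega(e_i, e_j) = 0, \qquad d\omega(f_i, f_j) = 0, \qquad d\omega(e_i, f_j) = \delta_{ij}.
\]
% So in the basis (e_1, \dots, e_m, f_1, \dots, f_m) the matrix of the form is
\[
\begin{pmatrix} 0 & I_m \\ -I_m & 0 \end{pmatrix},
\qquad \det = 1, \qquad \text{rank} = 2m .
\]
```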

Now, finally, we can give the definition of a contact structure on a manifold:
a contact structure on a manifold is a nondegenerate field of tangent
hyperplanes.

D The manifold of contact elements

The  term  “contact  structure”  stems  from  the  fact  that  there  is  always  such  a 
structure  on  a  manifold  of  contact  elements  of  a  smooth  n-manifold. 

Definition.  A  hyperplane  (dimension  n  —  1)  tangent  to  a  manifold  at  some 
point  is  called  a  contact  element,  and  this  point  the  point  of  contact. 

The set of all contact elements of an n-dimensional manifold has the structure
of a smooth manifold of dimension 2n - 1.


In fact, the set of contact elements with a fixed point of contact is the set of all
(n - 1)-dimensional subspaces of an n-dimensional vector space, i.e., a projective space of dimension n - 1.
To give a contact element we must therefore give the n coordinates of the point of contact
together with the n - 1 coordinates defining a point of an (n - 1)-dimensional projective
space: 2n - 1 coordinates in all.


The manifold of all contact elements of an n-dimensional manifold is a
fiber bundle whose base is our manifold and whose fiber is
(n - 1)-dimensional projective space.

Theorem. The bundle of contact elements is the projectivization of the cotangent
bundle: it can be obtained from the cotangent bundle by changing every
cotangent n-dimensional vector space into an (n - 1)-dimensional projective
space (a point of which is a line passing through the origin in the
cotangent space).


Proof. A contact element is given by a 1-form on the tangent space, for which this element is
a zero level set. This form is not zero, and it is determined up to multiplication by a nonzero
number. But a form on the tangent space is a vector of the cotangent space. Therefore, a
nonzero form on the tangent space, determined up to a multiplication by a nonzero number,
is a nonzero vector of the cotangent space, determined up to a multiplication by a nonzero
number, i.e., a point of the projectivized cotangent space. □


The  contact  structure  on  the  manifold  of  contact  elements.  In  the  tangent 
space  to  the  manifold  of  contact  elements  there  is  a  distinguished  hyperplane. 
It  is  called  the  contact  hyperplane  and  is  defined  in  the  following  way. 

We fix a point of the (2n - 1)-dimensional manifold of contact elements
on an n-dimensional manifold. We can think of this point as an
(n - 1)-dimensional plane tangent to the original n-dimensional manifold.

Definition. A tangent vector to the manifold of contact elements at a fixed
point belongs to the contact hyperplane if its projection onto the
n-dimensional manifold lies in the (n - 1)-dimensional plane which is the
given point of the manifold of contact elements.

In other words, a displacement of a contact element is tangent to the
contact hyperplane if the velocity of the point of contact belongs to this
contact element, no matter how the element turns.

Example. We take some submanifold of our n-dimensional manifold and
consider all (n - 1)-dimensional planes tangent to it (i.e., contact elements).
The set of all such contact elements forms a smooth submanifold of the
(2n - 1)-dimensional manifold of all contact elements. The dimension of
this submanifold is equal to n - 1, no matter what the dimension of the
original submanifold (which could be (n - 1)-dimensional, or have smaller
dimension, down to a curve or even a point).

This (n - 1)-dimensional submanifold of the (2n - 1)-dimensional
manifold of all contact elements is tangent at each of its points to the field of
contact hyperplanes (by the definition of contact hyperplane). Thus the
field of (2n - 2)-dimensional contact hyperplanes has an (n - 1)-dimensional
integral manifold.


Problem.  Does  this  field  of  planes  have  integral  manifolds  of  higher  dimensions? 


Answer.  No. 


Problem. Is it possible to give the field of contact hyperplanes by a differential 1-form on the
manifold of all contact elements?


Answer. No, even if the underlying n-dimensional manifold is a euclidean space (for example,
the ordinary two-plane).
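
One way to see what happens for the ordinary two-plane is to write the field of contact planes in coordinates. The following is a sketch under assumptions made here: the coordinates $(x, y, \varphi)$, where $\varphi$ is the angle of the contact element (a line) with the x-axis taken modulo $\pi$, and the particular local 1-form $\omega$, are not taken from the text.

```latex
% A velocity (\dot x, \dot y, \dot\varphi) lies in the contact plane when
% (\dot x, \dot y) is parallel to (\cos\varphi, \sin\varphi), i.e. when \omega = 0 for
\[
\omega = \sin\varphi\,dx - \cos\varphi\,dy .
\]
% Locally this form is nondegenerate in the sense of the Frobenius condition:
\[
d\omega = \cos\varphi\,d\varphi \wedge dx + \sin\varphi\,d\varphi \wedge dy,
\qquad
\omega \wedge d\omega = -\,dx \wedge dy \wedge d\varphi \neq 0 .
\]
% The angles \varphi and \varphi + \pi describe the same contact element, but
\[
\omega\big|_{\varphi + \pi} = -\,\omega\big|_{\varphi} ,
\]
% so this local form is not single-valued on the manifold of contact elements.
```

Carrying a contact element continuously through half a turn reverses the side of the contact plane on which any such form would be positive, which is consistent with the answer above.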


We will show below that the field of contact hyperplanes on the
(2n - 1)-dimensional manifold of all contact elements of an n-dimensional manifold is
nondegenerate. The proof uses the symplectic structure of the cotangent
bundle. The manifold of contact elements is related by a simple construction
to the space of the cotangent bundle (the projectivization of which is the
manifold of contact elements). Moreover, the nondegeneracy of the field of
contact planes of the projectivized bundle is closely related to the
nondegeneracy of the 2-form giving the symplectic structure of the cotangent
bundle.

The construction we are concerned with will be carried out below in a
somewhat more general situation. Namely, for any odd-dimensional manifold
with a contact structure we can construct its “symplectification”: a
symplectic manifold whose dimension is one larger. The inter-relation
between these two manifolds (the odd-dimensional contact manifold and the
even-dimensional symplectic manifold) is the same as between the manifold
of contact elements with its contact structure and the cotangent bundle with
its symplectic structure.

E Symplectification of a contact manifold

Consider  an  arbitrary  contact  manifold,  i.e.,  a  manifold  of  odd  dimension  N 
with  a  nondegenerate  field  of  tangent  hyperplanes  (of  even  dimension  N  —  1). 
We  will  call  these  planes  contact  planes.  Every  contact  plane  is  tangent  to 
the  contact  manifold  at  one  point.  We  will  call  this  point  the  point  of  contact. 

Definition.  A  contact  form  is  a  linear  form  on  the  tangent  space  at  the  point  of 
contact  of  the  manifold  such  that  its  zero  set  is  the  contact  plane. 

It  should  be  emphasized  that  the  contact  form  is  not  a  differential  form 
but  an  algebraic  linear  form  on  one  tangent  space. 

Definition. The symplectification of a contact manifold is the set of all contact
forms on the contact manifold, provided with the structure of a symplectic
manifold as defined below.

We  note  first  of  all  that  the  set  of  all  contact  forms  on  a  contact  manifold 
has  a  natural  structure  of  a  smooth  manifold  of  even  dimension  N  +  1. 
Namely,  we  can  consider  the  set  of  all  contact  forms  as  the  space  of  a  bundle 
over  the  original  contact  manifold.  Projection  onto  the  base  is  the  mapping 
associating  the  contact  form  to  the  point  of  contact. 

The  fiber  of  this  bundle  is  the  set  of  contact  forms  with  a  common  point  of 
contact.  All  such  forms  are  obtained  from  one  another  by  multiplication  by  a 
nonzero  number  (so  that  they  determine  the  same  contact  plane).  Thus  the 
fiber  of  our  bundle  is  one-dimensional:  it  is  the  line  minus  a  point. 

We  also  note  that  the  group  of  nonzero  real  numbers  acts  on  the  manifold 
of  all  contact  forms  by  the  operation  of  multiplication,  i.e.,  the  product  of  a 
contact  form  and  a  nonzero  number  is  again  a  contact  form.  In  this  way  the 
group  acts  on  our  bundle,  leaving  every  fiber  fixed  (upon  multiplication  of  a 
form  by  a  number  the  point  of  contact  is  not  changed). 

Remark.  So  far  we  have  not  used  the  nondegeneracy  of  the  field  of  planes. 
Nondegeneracy  is  needed  only  to  insure  that  the  manifold  obtained  by 
symplectification  is  symplectic. 

Example. Consider the manifold (of dimension 2n - 1) of all contact elements
of an n-dimensional smooth manifold. On the manifold of elements there is a
field of hyperplanes (which we defined above and called the contact
hyperplanes). Therefore, we can symplectify the manifold of contact elements.

As a result of symplectification we obtain a 2n-dimensional manifold.
This manifold is the space of the cotangent bundle of the original
n-dimensional manifold without zero vectors. The action by the multiplicative group
of real numbers on the fiber reduces to multiplication of vectors of the
cotangent space by a number.

On the cotangent bundle there is a distinguished 1-form “p dq.” There is
an analogous 1-form on any manifold obtained by symplectification from a
contact manifold.

The canonical 1-form on the symplectified space

Definition. The canonical 1-form in the symplectified space of a contact
manifold is the differential 1-form $\alpha$ whose value on any vector $\xi$ tangent
to the symplectified space at some point $p$ (Figure 237) is equal to the value
on the projection of the vector $\xi$ onto the tangent plane to the contact
manifold of the 1-form on this tangent plane which is the point $p$:

$$\alpha(\xi) = p(\pi_* \xi),$$

where $\pi$ is the projection of the symplectified space onto the contact
manifold.


Figure  237  Symplectification  of  a  contact  manifold 


Theorem. The exterior derivative of the canonical 1-form on the symplectified
space of a contact manifold is a nondegenerate 2-form.

Corollary. The symplectified space of a contact manifold has a symplectic
structure which is canonically (i.e., uniquely, without arbitrariness) determined
by the contact structure of the underlying odd-dimensional manifold.


Proof of theorem. Since the assertions of the theorem are local, it is sufficient to prove it in
a small neighborhood of a point of the manifold. In a small neighborhood of a point on a contact
manifold, a field of contact planes can be given by a differential form $\omega$ on the contact manifold.
We fix such a 1-form $\omega$.

By the same token we can represent the symplectified space of the contact manifold over
our neighborhood as the direct product of the neighborhood and the line minus a point. Namely,
we associate to the pair $(x, \lambda)$, where $x$ is a point of the contact manifold and $\lambda$ is a nonzero
number, the contact form given by the differential 1-form $\lambda\omega$ on the tangent space at the point $x$.
Thus in the part of the symplectified space we are considering, we have defined a function $\lambda$
whose values are nonzero numbers. It should be emphasized that $\lambda$ is only a local coordinate on
the symplectified manifold and that this coordinate is not defined canonically; it depends on
the choice of differential 1-form $\omega$. The canonical 1-form $\alpha$ can be written in our notation as

$$\alpha = \lambda\,\pi^*\omega$$

and does not depend on the choice of $\omega$. The exterior derivative of the 1-form $\alpha$ thus has the form

$$d\alpha = d\lambda \wedge \pi^*\omega + \lambda\,\pi^* d\omega.$$

We will show that the 2-form $d\alpha$ is nondegenerate, i.e., that for any nonzero vector $\xi$ tangent to
the symplectification, we can find a vector $\eta$ such that $d\alpha(\xi, \eta) \neq 0$. We select from vectors
tangent to the symplectification those of the following type. We call a vector $\xi$ vertical if it
is tangent to the fiber, i.e., if $\pi_*\xi = 0$. We call the vector $\xi$ horizontal if it is tangent to a level
surface of the function $\lambda$, i.e., if $d\lambda(\xi) = 0$. We call the vector $\xi$ a contact vector if its projection
onto the contact manifold lies in the contact plane, i.e., if $\omega(\pi_*\xi) = 0$ (in other words, if $\alpha(\xi) = 0$).
We calculate the value of the form $d\alpha$ on a pair of vectors $(\xi, \eta)$:

$$d\alpha(\xi, \eta) = (d\lambda \wedge \pi^*\omega)(\xi, \eta) + (\lambda\,\pi^* d\omega)(\xi, \eta).$$

Assume that $\xi$ is not a contact vector. For $\eta$, take a nonzero vertical vector, so that $\pi_*\eta = 0$.
Then the second term is equal to zero, and the first term is equal to

$$-d\lambda(\eta)\,\omega(\pi_*\xi),$$

which is not zero since $\eta$ is a nonzero vertical vector and $\xi$ is not a contact vector. Thus if $\xi$
is not a contact vector, we have found an $\eta$ for which $d\alpha(\xi, \eta) \neq 0$.

Now assume that $\xi$ is a contact vector and not vertical. Then for $\eta$ we take any contact
vector. Now the first term is entirely zero, and the second (and therefore the sum) is reduced
to $\lambda\,d\omega(\pi_*\xi, \pi_*\eta)$. Since $\xi$ is not vertical, the vector $\pi_*\xi$ lying in the contact plane is not zero.
But the 2-form $d\omega$ is nondegenerate on the contact plane (by the definition of contact structure).
Thus there is a contact vector $\eta$ such that $d\omega(\pi_*\xi, \pi_*\eta) \neq 0$. Since $\lambda \neq 0$, we have found a
vector $\eta$ for which $d\alpha(\xi, \eta) \neq 0$.

Finally, if the vector $\xi$ is nonzero and vertical, then for $\eta$ we can take any vector which is
not a contact vector. □
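
The formula for the exterior derivative of the canonical 1-form can be made explicit for the standard example. The following is a worked sketch for three-space with $\omega = x\,dy + dz$; the use of $(x, y, z, \lambda)$ as coordinates on the symplectification follows the local product description in the proof, and the criterion that a 2-form on a four-dimensional space is nondegenerate exactly when its exterior square is nonzero is a standard fact assumed here.

```latex
\[
\alpha = \lambda\,(x\,dy + dz), \qquad
d\alpha = d\lambda \wedge (x\,dy + dz) + \lambda\,dx \wedge dy .
\]
% The squares of the two summands vanish, so only the cross term survives:
\[
d\alpha \wedge d\alpha
  = 2\lambda\,d\lambda \wedge (x\,dy + dz) \wedge dx \wedge dy
  = 2\lambda\,d\lambda \wedge dz \wedge dx \wedge dy ,
\]
% which is nonzero exactly when \lambda \neq 0, i.e. everywhere on the symplectification.
```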

Remark. The constructions of the 1-form $\alpha$ and the 2-form $d\alpha$ are valid
for an arbitrary manifold with a field of hyperplanes, and do not depend on
the condition of nondegeneracy. However, the 2-form $d\alpha$ will define a
symplectic structure only in the case when the field of planes is nondegenerate.

Proof. Assume that the field is degenerate, i.e., that there exists a nonzero vector $\xi'$ in a plane
of the field such that $d\omega(\xi', \eta') = 0$ for all vectors $\eta'$ in this plane. For such a $\xi'$, the quantity
$d\omega(\xi', \eta')$ as a function of $\eta'$ is a linear form, identically equal to zero on the plane of the field.
Therefore there is a number $\mu$ not dependent on $\eta'$ such that

$$d\omega(\xi', \eta') = \mu\,\omega(\eta')$$

for all vectors $\eta'$ of the tangent space.

We now take for $\xi$ a tangent vector to the symplectified manifold for which $\pi_*\xi = \xi'$. Such
a vector $\xi$ is determined up to addition of a vertical summand, and we will show that for a suitable
choice of this summand we will have

$$d\alpha(\xi, \eta) = 0 \quad \text{for all } \eta.$$

The first term of the formula for $d\alpha$ is equal to $d\lambda(\xi)\,\omega(\pi_*\eta)$ (since $\omega(\pi_*\xi) = 0$). The second
term is equal to $\lambda\,d\omega(\pi_*\xi, \pi_*\eta) = \lambda\mu\,\omega(\pi_*\eta)$. We choose the vertical component of the vector $\xi$
so that $d\lambda(\xi) = -\lambda\mu$. Then $\xi$ will be skew-orthogonal to all vectors $\eta$.

Thus if $d\alpha$ is a symplectic structure, then the underlying field of hyperplanes is a contact
structure. □

Corollary. The field of contact hyperplanes defines a contact structure on the
manifold of all contact elements of any smooth manifold.

Proof. The symplectification of the $(2n - 1)$-dimensional manifold of all
contact elements on an $n$-dimensional smooth manifold, constructed with
the help of the field of $(2n - 2)$-dimensional contact planes, is by construction
the space of the cotangent bundle of the underlying $n$-dimensional manifold
without the zero cotangent vectors. The canonical 1-form $\alpha$ on the
symplectification is, by its definition, the same 1-form on the cotangent bundle
that we called “$p\,dq$” and which is fundamental in hamiltonian mechanics (cf.
Section 37). Its derivative $d\alpha$ is therefore the form “$dp \wedge dq$” defining the
usual symplectic structure of a phase space. Therefore the form $d\alpha$ is
nondegenerate, and, by the preceding remark, the field of contact hyperplanes is
nondegenerate. □

F  Contact  diffeomorphisms  and  vector  fields 

Definition.  A  diffeomorphism  of  a  contact  manifold  to  itself  is  called  a 
contact  diffeomorphism  if  it  preserves  the  contact  structure,  i.e.,  carries 
every  plane  of  a  given  structure  of  a  field  of  hyperplanes  to  a  plane  of  the 
same  field. 


Example. Consider the $(2n - 1)$-dimensional manifold of contact elements
of an $n$-dimensional smooth manifold with its usual contact structure. To
each contact element we can ascribe a “positive side” by choosing one of the
halves into which this element divides the tangent space to the $n$-dimensional
manifold.

We will call a contact element with a chosen side a (transversally) oriented
contact element.

The oriented contact elements on our $n$-dimensional manifold form a
$(2n - 1)$-dimensional smooth manifold with a natural contact structure (it
is a double covering of the manifold of ordinary nonoriented contact
elements).

Now assume that we are given a riemannian metric on the underlying
$n$-dimensional manifold. Then there is a “geodesic flow”^100 on the manifold
of oriented contact elements. The transformation after time $t$ by this flow
is defined as follows. We go out from the point of contact of a contact element
along the geodesic orthogonal to it and directed to the side orienting the
element. In the course of time $t$ we will move the point of contact along the
geodesic, keeping the element orthogonal to the geodesic. After time $t$ we
obtain a new oriented element. We have defined the geodesic flow of oriented
contact elements.

^100 Strictly speaking, we need to require that the riemannian manifold be complete, i.e., that
geodesics can be continued without limit.

Theorem.  The  geodesic  flow  of  oriented  contact  elements  consists  of  contact 
diffeomorphisms. 

The proof of this theorem will not be presented since it is just a
reformulation in new terms of Huygens’ principle (cf. Section 46).

Definition.  A  vector  field  on  a  contact  manifold  is  called  a  contact  vector  field 
if  it  is  the  velocity  field  of  a  one-parameter  (local)  group  of  contact 
diffeomorphisms. 


Theorem.  The  Poisson  bracket  of  contact  vector  fields  is  a  contact  vector  field. 
The  contact  vector  fields  form  a  subalgebra  in  the  Lie  algebra  of  all  smooth 
vector  fields  on  a  contact  manifold. 

The  proof  follows  directly  from  the  definitions. 

G  Symplectification of contact diffeomorphisms and fields

For every contact diffeomorphism of a contact manifold there is a canonically
constructed symplectic diffeomorphism of its symplectification. This
symplectic diffeomorphism commutes with the action of the multiplicative group
of real numbers on the symplectified manifold and is defined by the following
construction.

Recall  that  a  point  of  the  symplectified  manifold  is  a  contact  form  on  the 
underlying  contact  manifold. 


Definition. The image of a contact form $p$ with point of contact $x$ under the
action of a contact diffeomorphism $f$ of the contact manifold to itself is
the form

$$f_! p = (f^*|_x)^{-1} p.$$

In simple terms, we carry the form $p$ from the tangent space at the point $x$
to the tangent space at $f(x)$ using the diffeomorphism $f$ (whose derivative at
$x$ determines an isomorphism between these two tangent spaces). The form
$f_! p$ is a contact form since the diffeomorphism $f$ is a contact diffeomorphism.

Theorem. The mapping $f_!$ defined above of the symplectification of a contact
manifold to itself is a symplectic diffeomorphism which commutes with the
action of the multiplicative group of real numbers and preserves the canonical
1-form on the symplectification.

Proof. The assertion of the theorem follows from the fact that the canonical 1-form, the
symplectic 2-form, and the action of the group of real numbers are all determined by the contact
structure itself (for their construction we did not use coordinates or any other noninvariant
tools), and the diffeomorphism $f$ preserves the contact structure. It follows from this that $f_!$
preserves everything that was invariantly constructed using the contact structure, in particular
the 1-form $\alpha$, its derivative $d\alpha$, and the action of the group.


Theorem. Every symplectic diffeomorphism of the symplectification of a contact
manifold which commutes with the action of the multiplicative group (1)
projects onto the underlying contact manifold as a contact diffeomorphism
and (2) preserves the canonical 1-form $\alpha$.


Proof. Every diffeomorphism which commutes with the action of the multiplicative group
projects onto some diffeomorphism of the contact manifold. To show that this is a contact
diffeomorphism it is sufficient to prove the second assertion of the theorem (since only those
vectors $\xi$ for which $\alpha(\xi) = 0$ project onto the contact plane).

To prove the second assertion we express the integral of the form $\alpha$ along any path $\gamma$ in terms
of the symplectic structure $d\alpha$:

$$\int_\gamma \alpha = \lim_{\varepsilon \to 0} \iint_{\sigma(\varepsilon)} d\alpha,$$

where the 2-chain $\sigma(\varepsilon)$ is obtained from $\gamma$ by multiplication by all numbers in the interval $[\varepsilon, 1]$.
The boundary of $\sigma$ contains, besides $\gamma$, two vertical intervals and the path $\varepsilon\gamma$. The integrals of $\alpha$
over the vertical intervals are equal to zero, and the integral over $\varepsilon\gamma$ approaches 0 as $\varepsilon$ does.

Now from the invariance of the 2-form $d\alpha$ and the commutativity of our diffeomorphism $F$
with multiplication by numbers it follows that for any path $\gamma$

$$\int_{F\gamma} \alpha = \int_\gamma \alpha,$$

and thus the diffeomorphism $F$ preserves the 1-form $\alpha$. □


Definition. The symplectification of a contact vector field is defined by the
following construction. Consider the field as a velocity field of a
one-parameter group of contact diffeomorphisms. Symplectify the
diffeomorphisms. Consider the velocity field of this group. It is called the
symplectification of the original field.

Theorem. The symplectification of a contact vector field is a hamiltonian vector
field. The hamiltonian can be chosen to be homogeneous of first order with
respect to the action of the multiplicative group of real numbers:

$$H(\lambda x) = \lambda H(x).$$

Conversely, every hamiltonian field on a symplectified contact manifold,
having a hamiltonian which is homogeneous of degree 1, projects onto the
underlying contact manifold as a contact vector field.

Proof. The fact that symplectifications of contact diffeomorphisms are
symplectic implies that the symplectification of a contact field is
hamiltonian. The homogeneity of the hamiltonian follows from the homogeneity of
symplectic diffeomorphisms (from commutativity with multiplication by $\lambda$).
Thus the first assertion of the theorem follows from the theorem on
symplectifications of contact diffeomorphisms. The second part follows in the
same way from the theorem on homogeneous symplectic diffeomorphisms. □

Corollary.  Symplectification  of  vector  fields  is  an  isomorphic  map  of  the  Lie 
algebra  of  contact  vector  fields  onto  the  Lie  algebra  of  all  locally  hamiltonian 
vector  fields  with  hamiltonians  which  are  homogeneous  of  degree  1. 

The  proof  is  clear. 

H  Darboux's  theorem  for  contact  structures 

Darboux’s theorem is a theorem on the local uniqueness of a contact
structure. It can be formulated in any of the following three ways.

Theorem. All contact manifolds of the same dimension are locally contact
diffeomorphic (i.e., there is a diffeomorphism of a sufficiently small
neighborhood of any point of one contact manifold onto a neighborhood of any point
of the other which carries the noted point of the first neighborhood to the
noted point of the second and the field of planes in the first neighborhood to
the field of planes in the second).

Theorem. Every contact manifold of dimension $2m - 1$ is locally contact
diffeomorphic to the manifold of contact elements of $m$-dimensional space.

Theorem. Every differential 1-form defining a nondegenerate field of hyperplanes
on a manifold of dimension $2n + 1$ can be written in some local coordinate
system in the “normal form”

$$\omega = x\,dy + dz,$$

where $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$ and $z$ are the local coordinates.

It is clear that the first two theorems follow from the third. We will deduce
the third one from an analogous theorem of Darboux on the normal form of
the 2-form giving a symplectic structure (cf. Section 43).

Proof of Darboux’s theorem. We symplectify our manifold. On this new $(2n + 2)$-dimensional
symplectic manifold there are a canonical 1-form $\alpha$, a nondegenerate 2-form $d\alpha$, a projection $\pi$
onto the underlying contact manifold and a vertical direction at every point.

The given differential 1-form $\omega$ on the contact manifold defines a contact form at every
point. These contact forms form a $(2n + 1)$-dimensional submanifold of the symplectic
manifold. The projection $\pi$ maps this submanifold diffeomorphically onto the underlying contact
manifold, and the verticals intersect this submanifold at a nonzero angle.

Consider a point in the surface just constructed (in the symplectic manifold) lying over the
point of the contact manifold we are interested in. In the symplectic manifold we can choose a
local system of coordinates near this point such that

$$d\alpha = dp_0 \wedge dq_0 + \cdots + dp_n \wedge dq_n$$

and such that the coordinate surface $p_0 = 0$ coincides with our $(2n + 1)$-dimensional manifold
(cf. Section 43, where in the proof of the symplectic Darboux’s theorem the first coordinate may
be chosen arbitrarily).

We note now that the 1-form $p_0\,dq_0 + \cdots + p_n\,dq_n$ has derivative $d\alpha$. Thus, locally,

$$\alpha = p_0\,dq_0 + \cdots + p_n\,dq_n + dw,$$

where $w$ is a function which can be taken to be zero at the origin. In particular, on the surface
$p_0 = 0$ the form $\alpha$ takes the form

$$\alpha|_{p_0 = 0} = p_1\,dq_1 + \cdots + p_n\,dq_n + dw.$$

The projection $\pi$ allows us to carry the coordinates $p_1, \dots, p_n$; $q_0$; $q_1, \dots, q_n$ and the function
$w$ onto the contact manifold. More precisely, we define functions $x$, $y$, and $z$ by the formulas

$$x_i(\pi A) = p_i(A), \qquad y_i(\pi A) = q_i(A), \qquad z(\pi A) = w(A),$$

where $A$ is a point on the surface $p_0 = 0$.

Then we obtain

$$\omega = x\,dy + dz,$$

and it remains only to verify that the functions $(x_1, \dots, x_n;\ y_1, \dots, y_n;\ z)$ form a coordinate
system. For this it is sufficient to verify that the partial derivative of $w$ with respect to $q_0$ is not
zero, or in other words that the 1-form $\alpha$ is not zero on a vector of the coordinate direction $q_0$.
The latter is equivalent to the 2-form $d\alpha$ being nonzero on the pair of vectors: the basic vector
in the direction of $q_0$ and the vertical vector.

But a vector in the coordinate direction $q_0$ is skew-orthogonal to all vectors of the coordinate
plane $p_0 = 0$. If it were also skew-orthogonal to the vertical vector, then it would be
skew-orthogonal to all vectors, which contradicts the nondegeneracy of $d\alpha$. Thus $\partial w/\partial q_0 \neq 0$, and the
theorem is proved. □

I  Contact  hamiltonians 

Suppose that the contact structure of a contact manifold is given by a
differential 1-form $\omega$, and that this form is fixed.

Definition. The $\omega$-embedding of the contact manifold into its symplectification
is the map associating to a point of the contact manifold the restriction of
the form $\omega$ to the tangent plane at this point.

Definition. The contact hamiltonian function of a contact vector field on a
contact manifold with fixed 1-form $\omega$ is the function $K$ on the contact
manifold whose value at each point is the value of the homogeneous
hamiltonian $H$ of the symplectification of the field on the image of the
given point under the $\omega$-embedding:

$$K(x) = H(\omega|_x).$$

Theorem. The contact hamiltonian function $K$ of a contact vector field $X$ on a
contact manifold with a given 1-form $\omega$ is equal to the value of the form $\omega$
on this contact field:

$$K = \omega(X).$$

Proof. We use the expression for the increment of the ordinary hamiltonian function over a
path in terms of the vector field and the symplectic structure (Section 48, C). For this we draw a
vertical interval $\{\lambda B\}$, $0 < \lambda \le 1$, through the point $B$ of the symplectification at which we
want to calculate the hamiltonian function. The translations of this interval over small time $\tau$
under the action of the symplectified flow defined by our field $X$ fill out a two-dimensional
region $\sigma(\tau)$. The value of the hamiltonian at the point $B$ is equal to the limit

$$H(B) = \lim_{\tau \to 0} \tau^{-1} \iint_{\sigma(\tau)} d\alpha,$$

since $H(\lambda B) \to 0$ as $\lambda \to 0$. But the integral of the form $d\alpha$ over the region $\sigma(\tau)$ is the integral of
the 1-form $\alpha$ along the edge formed by the trajectory of the point $B$ (the other parts of the boundary
give zero integrals). Therefore, the double integral is simply the integral of the 1-form $\alpha$ along
the interval of trajectories, and the limit is the value of $\alpha$ on the velocity vector $Y$ of the
symplectified field. Thus $K(\pi B) = H(B) = \alpha(Y) = \omega(X)$, as was to be shown. □

J  Computational  formulas 

Suppose now that we make use of the coordinates in Darboux’s theorem in
which the form $\omega$ has the normal form

$$\omega = x\,dy + dz, \qquad x = (x_1, \dots, x_n), \quad y = (y_1, \dots, y_n).$$

Problem. Find the components of the contact field with a given contact
hamiltonian function $K = K(x, y, z)$.

Answer. The equations of the contact flow have the form

$$\dot{x} = -K_y + xK_z, \qquad \dot{y} = K_x, \qquad \dot{z} = K - xK_x.$$

Solution. A point of the symplectification can be given by the $2n + 2$ numbers $x_i$, $y_i$, $z$,
and $\lambda$, where $(x, y, z)$ are the coordinates of a point of the contact manifold and $\lambda$ is the number
by which we must multiply $\omega$ to obtain the given point of the symplectified space.

In these coordinates $\alpha = \lambda x\,dy + \lambda\,dz$. Therefore, in the coordinate system $\mathbf{p}$, $\mathbf{q}$, where

$$\mathbf{p} = (p, p_0), \qquad p = \lambda x, \qquad p_0 = \lambda,$$

$$\mathbf{q} = (q, q_0), \qquad q = y, \qquad q_0 = z,$$

the form $\alpha$ takes the standard form:

$$\alpha = \mathbf{p}\,d\mathbf{q}, \qquad d\alpha = d\mathbf{p} \wedge d\mathbf{q}.$$

The action $T_\mu$ of the multiplicative group is now reduced to multiplication of $\mathbf{p}$ by a number.
The contact hamiltonian $K$ can be expressed in terms of the ordinary hamiltonian
$H = H(p, q, p_0, q_0)$ by the formula

$$K(x, y, z) = H(x, y, 1, z).$$

The function $H$ is homogeneous of degree 1 in $\mathbf{p}$. Therefore, the partial derivatives of $K$ at the
point $(x, y, z)$ are related to the derivatives of $H$ at the point $(p = x,\ p_0 = 1,\ q = y,\ q_0 = z)$ by
the relations

$$H_q = K_y, \qquad H_{q_0} = K_z,$$

$$H_p = K_x, \qquad H_{p_0} = K - xK_x.$$

Hamilton’s equations with hamiltonian function $H$ therefore have the following form at the
point under consideration:

$$\dot{x} + x\dot{\lambda} = -K_y, \qquad \dot{\lambda} = -K_z,$$

$$\dot{y} = K_x, \qquad \dot{z} = K - xK_x,$$

from which we obtain the answer above.
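
As a quick sanity check of the answer, here is a minimal Python sketch for n = 1. It evaluates the components of the contact field by central differences and confirms that the form x dy + dz takes the value K on that field, in agreement with the theorem of subsection I. The function names, the step h, and the sample hamiltonian are illustrative assumptions, not part of the text.

```python
def partials(F, x, y, z, h=1e-6):
    """Central-difference approximations to (F_x, F_y, F_z) at (x, y, z)."""
    Fx = (F(x + h, y, z) - F(x - h, y, z)) / (2 * h)
    Fy = (F(x, y + h, z) - F(x, y - h, z)) / (2 * h)
    Fz = (F(x, y, z + h) - F(x, y, z - h)) / (2 * h)
    return Fx, Fy, Fz


def contact_field(K, x, y, z):
    """Components (xdot, ydot, zdot) of the contact field with hamiltonian K."""
    Kx, Ky, Kz = partials(K, x, y, z)
    return (-Ky + x * Kz, Kx, K(x, y, z) - x * Kx)


def omega(x, v):
    """Value of the form x dy + dz on the vector v = (vx, vy, vz)."""
    return x * v[1] + v[2]


def sample_K(x, y, z):
    # Hypothetical contact hamiltonian chosen only for illustration.
    return x * x * y + z * y + 3.0 * z


point = (0.7, -1.2, 0.4)
X = contact_field(sample_K, *point)
# The last two printed numbers should agree up to rounding.
print(X, omega(point[0], X), sample_K(*point))
```

The identity holds for any K, since x K_x + (K - x K_x) = K; the numerical check only guards against sign slips in the three components.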


Problem.  Find  the  contact  hamiltonian  of  the  Poisson  bracket  of  two  contact 
fields  with  contact  hamiltonians  K  and  K'. 

Answer. $(K, K') + K_z\,EK' - K'_z\,EK$, where the brackets denote the Poisson
bracket in the variables $x$ and $y$ and $E$ is the Euler operator $EF = F - xF_x$.

Solution. In the notation of the solution of the preceding problem we must express the
ordinary Poisson bracket of the homogeneous hamiltonians $H$ and $H'$ at the point
$(p = x,\ p_0 = 1,\ q = y,\ q_0 = z)$ in terms of the contact hamiltonians $K$ and $K'$. We have

$$(H, H') = H_q H'_p - H_p H'_q + H_{q_0} H'_{p_0} - H_{p_0} H'_{q_0}.$$

Substituting the values of the derivatives from the preceding problem, we find at the point under
consideration

$$(H, H') = K_y K'_x - K_x K'_y + K_z(K' - xK'_x) - K'_z(K - xK_x).$$
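
The answer (K, K') + K_z EK' - K'_z EK can be evaluated numerically. The sketch below (n = 1) is only an illustration: the helper names and the sample hamiltonians are assumptions, and the sign convention (K, K') = K_y K'_x - K_x K'_y is read off from the last formula of the solution.

```python
def partials(F, x, y, z, h=1e-6):
    """Central-difference approximations to (F_x, F_y, F_z) at (x, y, z)."""
    Fx = (F(x + h, y, z) - F(x - h, y, z)) / (2 * h)
    Fy = (F(x, y + h, z) - F(x, y - h, z)) / (2 * h)
    Fz = (F(x, y, z + h) - F(x, y, z - h)) / (2 * h)
    return Fx, Fy, Fz


def euler(F, x, y, z):
    """Euler operator EF = F - x F_x."""
    Fx, _, _ = partials(F, x, y, z)
    return F(x, y, z) - x * Fx


def contact_bracket(K, Kp, x, y, z):
    """Contact hamiltonian of the bracket: (K, K') + K_z EK' - K'_z EK."""
    Kx, Ky, Kz = partials(K, x, y, z)
    Kpx, Kpy, Kpz = partials(Kp, x, y, z)
    poisson = Ky * Kpx - Kx * Kpy  # bracket in x and y (assumed sign convention)
    return poisson + Kz * euler(Kp, x, y, z) - Kpz * euler(K, x, y, z)


# Hypothetical sample hamiltonians, chosen only for illustration.
def K1(x, y, z):
    return x * y + z * z


def K2(x, y, z):
    return 1.0


def K3(x, y, z):
    return x * z + y


point = (0.5, 2.0, 0.3)
print(contact_bracket(K1, K2, *point))  # with K' = 1 the formula reduces to K_z
print(contact_bracket(K1, K3, *point) + contact_bracket(K3, K1, *point))
```

Two properties are visible directly from the formula and make convenient checks: the bracket is antisymmetric, and taking K' = 1 leaves only the term K_z.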

K  Legendre  manifolds 

The  lagrangian  submanifolds  of  a  symplectic  phase  space  correspond  in  the 
contact  case  to  an  interesting  class  of  manifolds  which  may  be  called  Legendre 
manifolds  since  they  are  closely  related  to  Legendre  transformations. 

Definition. A Legendre submanifold of a $(2n + 1)$-dimensional contact
manifold is an $n$-dimensional integral manifold of the field of contact planes.

In  other  words,  it  is  an  integral  manifold  of  the  highest  possible  dimension 
for  a  nondegenerate  field  of  planes. 

Example 1. The set of all contact elements tangent to a submanifold of any
dimension in an $m$-dimensional manifold is an $(m - 1)$-dimensional Legendre
submanifold of the $(2m - 1)$-dimensional contact manifold of all contact
elements.

Example 2. The set of all planes tangent to the graph of a function $f = \varphi(x)$
in an $(n + 1)$-dimensional euclidean space with coordinates $(x_1, \dots, x_n; f)$
is a Legendre submanifold of the $(2n + 1)$-dimensional space of all
nonvertical hyperplane elements in the space of the graph (the contact structure
is given by the 1-form

$$\omega = p_1\,dx_1 + \cdots + p_n\,dx_n - df;$$

the element with coordinates $(p, x, f)$ passes through the point with
coordinates $(x, f)$ parallel to the plane $f = p_1 x_1 + \cdots + p_n x_n$).

The  Legendre  transformation  can  be  described  in  these  terms  in  the 
following  way. 

Consider a second $(2n + 1)$-dimensional contact space with coordinates
$(P, X, F)$ and contact structure given by the form

$$\Omega = P\,dX - dF.$$

The Legendre involution is the map taking a point of the first space with
coordinates $(p, x, f)$ to the point of the second space with coordinates

$$P = x, \qquad X = p, \qquad F = px - f.$$

The Legendre involution, as can be easily calculated, carries the first
contact structure to the second. Clearly, we have

Theorem. A diffeomorphism of one contact manifold onto another which carries
contact planes to contact planes, carries every Legendre manifold to a
Legendre manifold.

In  particular,  under  the  action  of  the  Legendre  involution  the  Legendre 
manifold  of  plane  elements  tangent  to  the  graph  of  a  function  is  carried  into  a 
new  Legendre  manifold.  This  new  manifold  is  called  the  Legendre  transform 
of  the  original  manifold. 

The projection of the new manifold onto the space with coordinates $(X, F)$
(parallel to the $P$-direction) is in general not a smooth manifold, but has
singularities. This projection is called the Legendre transform of the graph of
the function $\varphi$.

If the function $\varphi$ is convex, then the projection is itself the graph of a
function $F = \Phi(X)$. In this case $\Phi$ is called the Legendre transform of the
function $\varphi$.
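
The construction can be followed on a concrete case. Under the involution (p, x, f) -> (P, X, F) = (x, p, px - f) one has P dX - dF = x dp - d(px - f) = -(p dx - df), so contact planes go to contact planes. The Python sketch below, for n = 1, applies the involution to the elements tangent to the graph of a convex function and projects the result to the (X, F) plane. The choice of function, x^2, and all names are assumptions made for illustration.

```python
def legendre_involution(p, x, f):
    """Map (p, x, f) to (P, X, F) = (x, p, p*x - f)."""
    return (x, p, p * x - f)


def transformed_graph(phi, dphi, xs):
    """Apply the involution to the elements tangent to the graph f = phi(x)
    and return the projected points (X, F)."""
    points = []
    for x in xs:
        # The tangent element at x has coordinates (p, x, f) = (phi'(x), x, phi(x)).
        P, X, F = legendre_involution(dphi(x), x, phi(x))
        points.append((X, F))
    return points


def phi(x):
    # Illustrative convex function (an assumption, not from the text).
    return x * x


def dphi(x):
    return 2.0 * x


xs = [k / 10.0 for k in range(-20, 21)]
points = transformed_graph(phi, dphi, xs)
# For this phi the projected points lie on the graph F = X**2 / 4.
print(points[:3])
```

For a nonconvex function the same code still produces the projected curve, but the points no longer lie on the graph of a single-valued function of X.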

As another example we consider the motion of oriented contact elements
under the action of the geodesic flow on a riemannian manifold. As the
“initial wave front” we take some smooth submanifold of our riemannian
manifold (the dimension of the submanifold is arbitrary). The oriented
contact elements tangent to this submanifold form a Legendre manifold in the
space of all contact elements. From the preceding theorem we obtain

Corollary. The family of all elements tangent to a wave front is transformed
under the action of the geodesic flow after time $t$ to a Legendre manifold of
the space of all contact elements.

It  should  be  noted  that  this  new  Legendre  manifold  may  not  be  the  family 
of  all  elements  tangent  to  some  smooth  manifold,  since  a  wave  front  may 
develop  singularities. 

The Legendre singularities which arise in this way can be described in a
manner similar to lagrangian singularities (cf. Appendix 12). A Legendre
fibration of a $(2n + 1)$-dimensional contact manifold is a fibration all of
whose fibers are $n$-dimensional Legendre manifolds. A Legendre singularity
is a singularity of the projection of an $n$-dimensional Legendre submanifold
of a $(2n + 1)$-dimensional contact manifold onto the $(n + 1)$-dimensional
base of the Legendre fibration.

Consider the space $\mathbb{R}^{2n+1}$ with contact structure given by the form
$\alpha = x\,dy + dz$, where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. The projection
$(x, y, z) \to (y, z)$ gives a Legendre fibration.

An  equivalence  of  Legendre  fibrations  is  a  diffeomorphism  of  the  total 
spaces  of  the  fibrations  carrying  the  contact  structure  and  fibers  of  the  first 
bundle  to  the  contact  structure  and  fibers  of  the  second  bundle.  It  can  be 
shown  that  every  Legendre  bundle  is  equivalent  to  the  special  bundle  just 
described  in  a  neighborhood  of  every  point  of  the  space  of  the  bundle. 

The  contact  structure  of  the  total  space  of  fibration  gives  the  fibers  a  local 
structure  of  a  projective  space.  Legendre  equivalence  preserves  this  structure, 
i.e.,  defines  locally  projective  fiber  transformations. 

The following theorem allows us to locally describe Legendre
submanifolds and maps by using generating functions.

Theorem. For any partition $I + J$ of the set of indices $(1, \dots, n)$ into two
disjoint subsets and for any function $S(x_I, y_J)$ of $n$ variables $x_i$, $i \in I$, $y_j$, $j \in J$, the
formulas

$$y_I = \frac{\partial S}{\partial x_I}, \qquad x_J = -\frac{\partial S}{\partial y_J}, \qquad z = S - x_I \frac{\partial S}{\partial x_I}$$

define a Legendre submanifold of $\mathbb{R}^{2n+1}$. Conversely, every Legendre
submanifold of $\mathbb{R}^{2n+1}$ is defined in a neighborhood of every point by these formulas
for at least one of the $2^n$ possible choices of the subset $I$.

The proof is based on the fact that, on a Legendre manifold, $dz + x\,dy = 0$,
so $d(z + x_I y_I) = y_I\,dx_I - x_J\,dy_J$. □

In the formulas of the preceding theorem, we replace $S$ by a function from
the list of the simple lagrangian singularities given in Appendix 12. We
obtain Legendre singularities which are preserved under small deformations
of the Legendre mapping $(x, y, z) \to (y, z)$ (i.e., are carried to equivalent
singularities for small deformations of the function $S$). Every Legendre
mapping for $n < 6$ can be approximated by a map, all of whose singularities
are locally equivalent to singularities from the list $A_k$ $(1 \le k \le 6)$, $D_k$
$(4 \le k \le 6)$, $E_6$.

In particular, we obtain a list of the singularities of a wave front in general
position in spaces of dimension less than 7.

In ordinary three-space this list is as follows:

$$A_1\colon S = \pm x_1^2, \qquad A_2\colon S = \pm x_1^3, \qquad A_3\colon S = \pm x_1^4 + x_1^2 y_2,$$

where $I = \{1\}$, $J = \{2\}$, and $n = 2$.

The projections of the Legendre manifolds indicated here onto the base
of the Legendre bundle (i.e., onto the space with coordinates $y_1$, $y_2$, and $z$)
are: a simple point in the case of $A_1$, a cuspidal edge in the case of $A_2$, and a
swallowtail (cf. Figure 246) in the case of $A_3$.

Thus a wave front in general position in three-space has only cusps and
“swallowtail” points as singularities. At isolated moments of time during the
motion of the front we can observe transitions of the three types $A_4$, $D_4^-$ and
$D_4^+$ (cf. Appendix 12, where the corresponding caustics filled out by the
singularities of the front during its motion are drawn).
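
The generating-function formulas can be tried out on the swallowtail case. The Python sketch below takes n = 2, I = {1}, J = {2} and the function S = x_1^4 + x_1^2 y_2 (the plus sign is chosen arbitrarily), builds points of the corresponding submanifold from y_1 = S_{x_1}, x_2 = -S_{y_2}, z = S - x_1 S_{x_1}, and checks numerically that the form x dy + dz vanishes on its tangent vectors. All function names and the step h are illustrative assumptions.

```python
def legendre_point(S, S_x1, S_y2, x1, y2):
    """Point (x1, x2, y1, y2, z) given by the generating function S(x1, y2)."""
    y1 = S_x1(x1, y2)
    x2 = -S_y2(x1, y2)
    z = S(x1, y2) - x1 * S_x1(x1, y2)
    return (x1, x2, y1, y2, z)


def contact_form_on_tangents(S, S_x1, S_y2, x1, y2, h=1e-4):
    """Values of x dy + dz on the two tangent vectors obtained by varying
    the parameters x1 and y2 (central differences)."""
    base = legendre_point(S, S_x1, S_y2, x1, y2)
    values = []
    for d1, d2 in ((h, 0.0), (0.0, h)):
        a = legendre_point(S, S_x1, S_y2, x1 + d1, y2 + d2)
        b = legendre_point(S, S_x1, S_y2, x1 - d1, y2 - d2)
        v = [(ai - bi) / (2 * h) for ai, bi in zip(a, b)]
        # v = (dx1, dx2, dy1, dy2, dz); the form is x1*dy1 + x2*dy2 + dz.
        values.append(base[0] * v[2] + base[1] * v[3] + v[4])
    return values


# Generating function of type A_3 (plus sign) and its exact partial derivatives.
def S(x1, y2):
    return x1**4 + x1**2 * y2


def S_x1(x1, y2):
    return 4 * x1**3 + 2 * x1 * y2


def S_y2(x1, y2):
    return x1**2


def front_point(x1, y2):
    """Projection of the point onto the base with coordinates (y1, y2, z)."""
    p = legendre_point(S, S_x1, S_y2, x1, y2)
    return (p[2], p[3], p[4])


print(contact_form_on_tangents(S, S_x1, S_y2, 0.3, -0.5))  # both values near zero
print(front_point(1.0, 2.0))
```

Sampling front_point over a grid of (x_1, y_2) gives the surface (4t^3 + 2ts, s, -3t^4 - t^2 s), which can be plotted to see the swallowtail.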


Problem 1. Lay out an interval of length $t$ on every interior normal to an ellipse in the plane.
Draw the curve obtained and investigate its singularities and its transitions as $t$ changes.


Problem  2.  Do  the  same  thing  for  a  triaxial  ellipsoid  in  three-dimensional  space. 


L  Contactification 

Along with symplectification of contact manifolds, there is a contactification
of symplectic manifolds with symplectic structure cohomologous to zero.

The contactification $E^{2n+1}$ of the symplectic manifold $(M^{2n}, \omega^2)$ is
constructed as the space of a bundle with fiber $\mathbb{R}$ over $M^{2n}$. Let $U$ be a sufficiently
small neighborhood of a point $x$ in $M$, so that there is a canonical coordinate
system $p$, $q$ on $U$ with $\omega = dp \wedge dq$. Consider the direct product $U \times \mathbb{R}$
with coordinates $p$, $q$, $z$. Let $V \times \mathbb{R}$ be the same kind of product constructed
on another (or the same) neighborhood $V$, with coordinates $P$, $Q$, $Z$; $dP \wedge dQ = \omega$.
If the neighborhoods $U$ and $V$ on $M$ intersect, then we identify the
fibers above the points of intersection in both representations so that the
form $dz + p\,dq = dZ + P\,dQ = \alpha$ is defined on the whole (this is possible
since $P\,dQ - p\,dq$ is a total differential on $U \cap V$).

It is easy to verify that after this pasting together we have a bundle $E^{2n+1}$
on $M^{2n}$ and that the form $\alpha$ defines a contact structure on $E$. The manifold $E$
is called the contactification of the symplectic manifold $M$. If the cohomology
class of the form $\omega^2$ is integral, then we can define a contactification with
fiber $S^1$.

M  Integration of first-order partial differential equations

Let  M2n  + 1  be  a  contact  manifold,  and  Eln  a  hypersurface  in  M2n+1.  The 
contact  structure  on  M  defines  some  geometric  structure  on  E— in  particular, 
the  field  of  so-called  characteristic  directions.  An  analysis  of  this  geometric 
structure  can  reduce  the  integration  of  general  first-order  nonlinear  partial 
differential  equations  to  the  integration  of  a  system  of  ordinary  differential 
equations. 

We  assume  that  the  manifold  £2"  is  transverse  to  the  contact  planes  at  all 
its  points.  In  this  case,  the  intersection  of  the  tangent  plane  to  E2”  at  each  of  its 
points  with  the  contact  plane  has  dimension  2 n  —  1,  so  that  we  have  a  field 
of  hyperplanes  on  E2n.  Furthermore,  the  contact  structure  on  M2n+1  defines 
on  E2n  a  field  of  lines  lying  in  these  (2 n  —  l)-dimensionaI  planes. 

In  fact,  let  a  be  a  1-form  on  M2n  +  1  locally  giving  the  contact  structure; 
let  o)  —  dot  and  let  IR2n  be  a  contact  plane  at  the  point  x  in  E2n.  Let  0  —  0 
be  the  local  equation  of  E2"  (so  d<t>  is  not  zero  at  x).  The  restriction  of  dO  to 
IR2"  defines  a  nonzero  linear  form  on  IR2".  The  2-form  co  gives  1R2”  the  structure 
of  a  symplectic  vector  space  and  thus  an  isomorphism  of  this  space  with  its 
dual.  The  nonzero  1-form  corresponds  to  a  nonzero  vector  £  of  IR2", 

so  that  d<P()  =  aft;,  ■).  The  vector  £  is  called  the  characteristic  vector  of  the 
manifold  E2"  at  the  point  x.  The  characteristic  vector  £  lies  in  the  inter¬ 
section  of  IR2"  with  the  tangent  plane  to  E2n,  so  that  =  0. 

The  vector  £  is  not  uniquely  defined  by  the  manifold  E2n  and  the  contact 
structure  on  M,  but  only  up  to  multiplication  by  a  nonzero  number.  In  fact, 
like  the  2-form  a)  on  R2",  the  1-form  dd>  on  R2"  is  defined  only  up  to  multi¬ 
plication  by  a  nonzero  number. 

The  direction  of  the  characteristic  vector  (i.e,,  the  line  containing  it)  is 
determined  uniquely  by  the  contact  structure  at  every  point  of  the  manifold 
E,  Thus  we  have  a  field  of  characteristic  directions  on  the  hypersurface  E  of 
the  contact  manifold  M.  The  integral  curves  of  this  field  of  directions  are 
called  the  characteristics. 

Now suppose we are given an $(n - 1)$-dimensional submanifold I of our
hypersurface $E^{2n}$, which is integral for the contact field (so that the tangent
plane to I at each point is contained in the contact plane).

Theorem. If at a point x of I the characteristic on $E^{2n}$ is not tangent to I, then
in a neighborhood of the point x the characteristics on $E^{2n}$ passing through
points of I form a Legendre submanifold $L^n$ in $M^{2n+1}$.

Proof. Let $\xi$ be a vector field on $E^{2n}$ made up of characteristic vectors. By
the homotopy formula (cf. Section 36G) we have on $E^{2n}$

$$L_\xi \alpha = d\, i_\xi \alpha + i_\xi\, d\alpha.$$

But $i_\xi \alpha = 0$ since the characteristic vector belongs to the contact plane.
Therefore, on $E^{2n}$ we have $L_\xi \alpha = i_\xi \omega$. But the 1-form $i_\xi \omega$ is zero on the
intersection of the tangent plane to $E^{2n}$ with the contact plane (since on the
contact plane $i_\xi \omega = d\Phi$, and on the tangent plane $d\Phi = 0$). Therefore, on
the tangent plane to $E^{2n}$ we have $i_\xi \omega = c\alpha$. Thus on the hypersurface E,

$$L_\xi \alpha = c\alpha$$

(where c is a function smooth in a neighborhood of x).

Now let $\{g^t\}$ be the (local) phase flow of the field $\xi$ and $\eta$ a vector tangent
to $E^{2n}$. Set $\eta(t) = g^t_* \eta$ and $y(t) = \alpha(\eta(t))$. Then the function y satisfies the
linear differential equation

$$\frac{dy}{dt} = c(t)\, y(t).$$

If $\eta(0)$ is tangent to I, then $y(0) = \alpha(\eta(0)) = 0$. By uniqueness of the
solution of a linear equation, this means $y(t) = \alpha(\eta(t)) = 0$, i.e., for all t,
$\eta(t)$ lies in the contact plane. Therefore, $g^t I$ is an integral manifold of the
contact field. Therefore the manifold formed by all $\{g^t I\}$ for small t is a
Legendre manifold. □

Example. Consider $\mathbb{R}^{2n+1}$ with coordinates $x_1, \ldots, x_n$; $p_1, \ldots, p_n$; u with
contact structure defined by the 1-form $\alpha = du - p\, dx$. A function $\Phi(x, p, u)$
defines a differential equation $\Phi(x, \partial u/\partial x, u) = 0$ and a submanifold
$E = \Phi^{-1}(0)$ in the space $\mathbb{R}^{2n+1}$ (called the space of 1-jets of functions on
$\mathbb{R}^n$).

An initial condition for the equation $\Phi = 0$ is an assignment of a value f
to the function u on an $(n - 1)$-dimensional hypersurface $\Gamma$ in the
n-dimensional space with coordinates $x_1, \ldots, x_n$.

An initial condition determines the derivatives of u in the $n - 1$ independent
directions at each point of $\Gamma$. The derivative in a direction transverse to
$\Gamma$ can generally be found from the equation; if the conditions of the implicit
function theorem are fulfilled, then the initial condition is called
noncharacteristic.

A noncharacteristic initial condition defines an $(n - 1)$-dimensional integral
submanifold I of the form $\alpha$ (the graph of the mapping $u = f(x)$, $p = p(x)$,
$x \in \Gamma$). The characteristics on E intersecting I form a Legendre submanifold
of $\mathbb{R}^{2n+1}$, the graph of the mapping $u = u(x)$, $p = \partial u/\partial x$. The function
$u(x)$ is a solution of the equation $\Phi(x, \partial u/\partial x, u) = 0$ with initial
condition $u|_\Gamma = f$.

Note that to find the function u we need only solve the system of 2n
first-order ordinary differential equations for the characteristics on E, and
perform a series of “algebraic” operations.
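
The reduction to ordinary differential equations described above can be sketched numerically. The example below is not from the text: it assumes n = 2 and the hypothetical function $\Phi(x, p, u) = p_1 + p_2 - u$ (the equation $\partial u/\partial x_1 + \partial u/\partial x_2 = u$), with $\Gamma$ the line $x_2 = 0$. The characteristic system is written in the commonly used normalization $\dot{x} = \Phi_p$, $\dot{p} = -\Phi_x - p\,\Phi_u$, $\dot{u} = p \cdot \Phi_p$; since the characteristic vector is defined only up to a nonzero factor, this normalization is an assumption, as are all function names.

```python
import math

def characteristic_rhs(state):
    """Characteristic vector field for the hypothetical example
    Phi(x, p, u) = p1 + p2 - u, with state = (x1, x2, p1, p2, u)."""
    x1, x2, p1, p2, u = state
    phi_x = (0.0, 0.0)   # partial derivatives of Phi in x
    phi_p = (1.0, 1.0)   # partial derivatives of Phi in p
    phi_u = -1.0         # partial derivative of Phi in u
    dp1 = -phi_x[0] - p1 * phi_u
    dp2 = -phi_x[1] - p2 * phi_u
    du = p1 * phi_p[0] + p2 * phi_p[1]
    return (phi_p[0], phi_p[1], dp1, dp2, du)

def rk4(rhs, state, t_end, steps=200):
    """Classical fourth-order Runge-Kutta integration from 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

def initial_point(s, f, df):
    """Point of the integral submanifold I lying over (s, 0) in Gamma."""
    u0 = f(s)
    p1 = df(s)       # derivative along Gamma, fixed by the initial condition
    p2 = u0 - p1     # transverse derivative, found from Phi = 0
    return (s, 0.0, p1, p2, u0)

def solve_along_characteristic(s, t, f, df):
    """Follow the characteristic through the point of I over (s, 0) for time t."""
    return rk4(characteristic_rhs, initial_point(s, f, df), t)
```

For this particular equation the exact solution is $u(x_1, x_2) = f(x_1 - x_2)\, e^{x_2}$, so the values of u produced along each characteristic can be compared with it directly.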

By the theorem of E. Noether, one-parameter groups of symmetries of a
dynamical  system  determine  first  integrals.  If  a  system  admits  a  larger  group 
of  symmetries,  then  there  are  several  integrals.  Simultaneous  level  manifolds 
of  these  first  integrals  in  the  phase  space  are  invariant  manifolds  of  the  phase 
flow.  The  subgroup  of  the  group  of  symmetries  mapping  such  an  invariant 
manifold  into  itself  acts  on  the  manifold.  In  many  cases,  we  can  look  at  the 
quotient  manifold  of  an  invariant  manifold  by  this  subgroup.  This  quotient 
manifold,  called  the  reduced  phase  space,  has  a  natural  symplectic  structure. 
The  original  hamiltonian  dynamical  system  induces  a  hamiltonian  system 
on  the  reduced  phase  space. 

The  partition  of  the  phase  space  into  simultaneous  level  manifolds 
generally  has  singularities.  An  example  is  the  partition  of  a  phase  plane  into 
energy  level  curves. 

In this appendix we will briefly discuss dynamical systems in reduced
phase space and their relationship with invariant manifolds in the original
space. All these questions were investigated by Jacobi and Poincaré
(“elimination of the nodes” in the many-body problem, “reduction of order” in
systems with symmetries, “stationary rotations” of rigid bodies, etc.). A
detailed presentation in current terminology can be found in the following
articles: S. Smale, “Topology and mechanics,” Inventiones Mathematicae
10:4 (1970), 305-331, 11:1 (1970), 45-64; and J. Marsden and A. Weinstein,
“Reduction of symplectic manifolds with symmetries,” Reports on
Mathematical Physics 5 (1974), 121-130.

A  Poisson  action  of  Lie  groups 

Consider a symplectic manifold $(M^{2n}, \omega^2)$ and suppose a Lie group G acts
on it as a group of symplectic diffeomorphisms. Every one-parameter
subgroup of G then acts as a locally hamiltonian phase flow on M. In many
important cases, these flows have single-valued hamiltonian functions.


Example. Let V be a smooth manifold and G some Lie group of diffeomorphisms of V. Since
every diffeomorphism takes 1-forms on V to 1-forms, the group G acts on the cotangent bundle
$M = T^*V$.

Recall that on the cotangent bundle there is always a canonical 1-form $\alpha$ (“$p\, dq$”) and a
natural symplectic structure $\omega = d\alpha$. The action of the group G on M is symplectic since it
preserves the 1-form $\alpha$ and hence also the 2-form $d\alpha$.

A one-parameter subgroup $\{g^t\}$ of G defines a phase flow on M. It is easy to verify that this
phase flow has a single-valued hamiltonian function. In fact, the hamiltonian function is given
by the formula from Noether’s theorem:

$$H(x) = \alpha\left( \left. \frac{d}{dt} \right|_{t=0} g^t x \right),$$

where $x \in M$.


We now assume that we are given a symplectic action of a Lie group G
on a connected symplectic manifold M such that, to every element a of the
Lie algebra of G, there corresponds a one-parameter group of symplectic
diffeomorphisms with a single-valued hamiltonian $H_a$. These hamiltonians
are determined up to the addition of constants which can be chosen so that
the dependence of $H_a$ upon a is linear. To do this, it is sufficient to choose
arbitrarily the constants in the hamiltonians for a set of basis vectors of the
Lie algebra of G, and to then define the hamiltonian function for each element
of the algebra as a linear combination of the basis functions.

Thus, given a symplectic action of a Lie group G and a single-valued
hamiltonian on M, we can construct a linear mapping of the Lie algebra of
G into the Lie algebra of hamiltonian functions on M. The function $H_{[a,b]}$
associated to the commutator of two elements of the Lie algebra is equal to
the Poisson bracket $(H_a, H_b)$, or else it differs from this Poisson bracket by a
constant:

$$H_{[a,b]} = (H_a, H_b) + C(a, b).$$

Remark.  The  appearance  of  the  constant  C  in  this  formula  is  a  consequence  of  an  interesting 
phenomenon:  the  existence  of  a  two-dimensional  cohomology  class  of  the  Lie  algebra  of 
(globally)  hamiltonian  fields. 

The  quantity  C(a,  b)  is  a  bilinear  skew-symmetric  function  on  the  Lie  algebra.  The  Jacobi 
identity  gives  us 


$$C([a, b], c) + C([b, c], a) + C([c, a], b) = 0.$$

A  bilinear  skew-symmetric  function  on  a  Lie  algebra  with  this  property  is  called  a  two-dimensional 
cocycle  of  the  Lie  algebra. 

If  we  choose  the  constants  in  the  hamiltonian  functions  differently,  then  the  cocycle  C  is 
replaced  by  C',  where 


$$C'(a, b) = C(a, b) + p([a, b]),$$

where  p  is  a  linear  function  on  the  Lie  algebra.  Such  a  cocycle  C'  is  said  to  be  cohomologous  to 
the  cocycle  C.  A  class  of  cocycles  which  are  cohomologous  to  one  another  is  called  a  cohomology 
class  of  the  Lie  algebra. 

Thus,  a  symplectic  action  of  a  group  G  for  which  single-valued  hamiltonians  exist  defines  a 
two-dimensional  cohomology  class  of  the  Lie  algebra  of  G.  This  cohomology  class  measures 
the  deviation  of  the  action  from  one  in  which  the  hamiltonian  function  of  a  commutator  can  be 
chosen  equal  to  the  Poisson  bracket  of  the  hamiltonian  functions. 


Definition. An action of a connected Lie group on a symplectic manifold is
called a Poisson action if the hamiltonian functions for one-parameter
groups are single-valued, and chosen so that the hamiltonian function
depends linearly on elements of the Lie algebra and so that the hamiltonian
function of a commutator is equal to the Poisson bracket of the
hamiltonian functions:

$$H_{[a,b]} = (H_a, H_b).$$

In  other  words,  a  Poisson  action  of  a  group  defines  a  homomorphism  from 
the  Lie  algebra  of  this  group  to  the  Lie  algebra  of  hamiltonian  functions. 

Example. Let V be a smooth manifold and G a Lie group acting on V as a group of
diffeomorphisms. Let $M = T^*V$ be the cotangent bundle of the manifold V with the usual symplectic
structure $\omega = d\alpha$. The hamiltonian functions of one-parameter groups are defined as above:

$$H_a(x) = \alpha\left( \left. \frac{d}{dt} \right|_{t=0} g^t x \right), \qquad x \in T^*V. \qquad (1)$$


Theorem. This action is Poisson.

Proof. By definition of the 1-form $\alpha$, the hamiltonian functions $H_a$ are linear “in p” (i.e., on
every cotangent space). Therefore, their Poisson brackets are also linear. Thus the function
$H_{[a,b]} - (H_a, H_b)$ is linear in p. Since it is constant, it is equal to zero. □

In the same way, we can show that the symplectification of any contact action is a Poisson
action.


Example. Let V be three-dimensional euclidean space and G the six-dimensional group of its
motions. The following six one-parameter groups form a basis of the Lie algebra: the
translations with velocity 1 along the coordinate axes $q_1$, $q_2$, and $q_3$ and the rotations with angular
velocity 1 around these axes. By formula (1), the corresponding hamiltonian functions are (in
the usual notation) $p_1, p_2, p_3$; $M_1, M_2, M_3$, where $M_1 = q_2 p_3 - q_3 p_2$, etc. The theorem
implies that the pairwise Poisson brackets of these six functions are equal to the hamiltonian
functions of the commutators of the corresponding one-parameter groups.
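
The bracket relations among these six functions can be checked by direct computation. The sketch below is an illustration only, under an assumed sign convention for the bracket, $(F, G) = \sum_i (\partial F/\partial q_i\, \partial G/\partial p_i - \partial F/\partial p_i\, \partial G/\partial q_i)$; with the opposite convention every result changes sign. The function names are hypothetical. Since all six functions are polynomials of degree at most 2, central differences in rational arithmetic give their derivatives exactly.

```python
from fractions import Fraction

def grad(f, z):
    """Gradient of f at z by central differences with step 1.
    Exact in rational arithmetic for polynomials of degree <= 2."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += 1
        zm[i] -= 1
        g.append((f(zp) - f(zm)) / 2)
    return g

def bracket(f, g, z):
    """Poisson bracket at z = (q1, q2, q3, p1, p2, p3), assumed convention."""
    df, dg = grad(f, z), grad(g, z)
    return sum(df[i] * dg[i + 3] - df[i + 3] * dg[i] for i in range(3))

# Hamiltonian functions of translations (linear momenta) ...
p1 = lambda z: z[3]
p2 = lambda z: z[4]
p3 = lambda z: z[5]
# ... and of rotations (angular momenta).
M1 = lambda z: z[1] * z[5] - z[2] * z[4]   # q2 p3 - q3 p2
M2 = lambda z: z[2] * z[3] - z[0] * z[5]   # q3 p1 - q1 p3
M3 = lambda z: z[0] * z[4] - z[1] * z[3]   # q1 p2 - q2 p1

z = [Fraction(k) for k in (1, 2, 3, 4, 5, 6)]
# With this convention: (M1, M2) = M3, (M1, p2) = p3, (p1, p2) = 0.
checks = (
    bracket(M1, M2, z) == M3(z),
    bracket(M1, p2, z) == p3(z),
    bracket(p1, p2, z) == 0,
)
```

The brackets of the six functions again lie in their linear span, which is what one expects if they represent the Lie algebra of the group of motions.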


A Poisson action of a group G on a symplectic manifold M defines a
mapping of M into the dual space of the Lie algebra of the group

$$P: M \to \mathfrak{g}^*.$$

That is, we fix a point x in M and consider the function on the Lie algebra
which associates to an element a of the Lie algebra the value of the
hamiltonian $H_a$ at the fixed point x:

$$p_x(a) = H_a(x).$$

This $p_x$ is a linear function on the Lie algebra and is the element of the dual
space to the algebra associated to x:

$$P(x) = p_x.$$

Following Souriau (Structure des systèmes dynamiques, Dunod, 1970), we
will call the mapping P the momentum. Note that the value of the momentum
is always a vector in the space $\mathfrak{g}^*$.


Example. Let V be a smooth manifold, G a Lie group acting on V as a group of diffeomorphisms,
$M = T^*V$ the cotangent bundle and $H_a$ the hamiltonian functions constructed above of the
action of G on M (cf. (1)).

Then the “momentum” mapping $P: M \to \mathfrak{g}^*$ can be described in the following way.
Consider the map $\Phi: G \to M$ given by the action of all the elements of G on a fixed point x in M
(so $\Phi(g) = gx$). The canonical 1-form $\alpha$ on M induces a 1-form $\Phi^*\alpha$ on G. Its restriction to the
tangent space at the identity of G is a linear form on the Lie algebra.

Thus to every point x in M we have associated a linear form on the Lie algebra. It is easy
to verify that this mapping is the momentum of our Poisson action.

In particular, if V is euclidean three-space and G is the group of rotations around the point 0,
then  the  values  of  the  momentum  are  the  usual  vectors  of  angular  momentum;  if  G  is  the  group 
of  rotations  around  an  axis,  then  the  values  of  the  momentum  are  the  angular  momenta  relative 
to  this  axis;  if  G  is  the  group  of  parallel  translations,  then  the  values  of  the  momentum  are  the 
vectors  of  linear  momentum. 
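
For the rotation group the statement can be verified directly. The sketch below (illustrative names, not from the text) evaluates the canonical 1-form $p\, dq$ on the velocity of a rotation with angular velocity vector a, which at the point q is the vector product $a \times q$. This gives $H_a(q, p) = p \cdot (a \times q)$, and the components of the momentum with respect to the coordinate axes are then compared with the vector product $q \times p$.

```python
def cross(a, b):
    """Vector product in three-dimensional euclidean space."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Scalar product."""
    return sum(x * y for x, y in zip(a, b))

def hamiltonian_of_rotation(a, q, p):
    """H_a(q, p): the form p dq evaluated on the velocity a x q."""
    return dot(p, cross(a, q))

def momentum(q, p):
    """Components of the momentum on the basis of rotations about the axes."""
    basis = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    return tuple(hamiltonian_of_rotation(e, q, p) for e in basis)

q = (1, 2, 3)
p = (4, 5, 6)
angular_momentum = cross(q, p)
same = momentum(q, p) == angular_momentum
```

The agreement is the identity $p \cdot (a \times q) = a \cdot (q \times p)$ for the scalar triple product.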


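Theorem. Under the momentum mapping P, a Poisson action of a connected
Lie group G is taken to the co-adjoint action of G on the dual space $\mathfrak{g}^*$ of its
Lie algebra (cf. Appendix 2), i.e., the following diagram commutes:

$$\begin{array}{ccc}
M & \xrightarrow{\ g\ } & M \\
\downarrow {\scriptstyle P} & & \downarrow {\scriptstyle P} \\
\mathfrak{g}^* & \xrightarrow{\ \mathrm{Ad}^*_g\ } & \mathfrak{g}^*
\end{array}$$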


Corollary. Suppose that a hamiltonian function $H: M \to \mathbb{R}$ is invariant under
the Poisson action of a group G on M. Then the momentum is a first integral
of the system with hamiltonian function H.

Proof of the theorem. The theorem asserts that the hamiltonian function $H_a$ of the
one-parameter group $h^t$ is carried over by the diffeomorphism g to the hamiltonian function
$H_{\mathrm{Ad}_g a}$ of the one-parameter group $g h^t g^{-1}$.

Let $g^s$ be a one-parameter group with hamiltonian function $H_b$. It is sufficient to show that
the derivatives with respect to s (for s = 0) of the functions $H_a(g^s x)$ and $H_{\mathrm{Ad}_{g^s} a}(x)$ are the same.
The first of these derivatives is the value at x of the Poisson bracket $(H_a, H_b)$. The second is
$H_{[a,b]}(x)$. Since the action is Poisson, the theorem is proved. □

Proof of the corollary. The derivative, in the direction of the phase flow with hamiltonian
function H, of each component of the momentum is zero, since it is equal to the derivative of
the function H in the direction of the phase flow corresponding to a one-parameter subgroup of G. □


B  The  reduced  phase  space 

Suppose that we are given a Poisson action of a group G on a symplectic
manifold M. Consider a level set of the momentum, i.e., the inverse image of
some point $p \in \mathfrak{g}^*$ under the map P. We denote this set by $M_p$, so that
(Figure 238)

$$M_p = P^{-1}(p).$$

In many important cases the set $M_p$ is a manifold. For example, this will
be so if p is a regular value of the momentum, i.e., if the differential of the map P
at each point of the set $M_p$ maps the tangent space to M onto the whole
tangent space to $\mathfrak{g}^*$.

In general, a Lie group G acting on M takes the sets $M_p$ into one another.
However, the stationary subgroup of a point p in the co-adjoint
representation (i.e., the subgroup consisting of those elements g of the group G for
which $\mathrm{Ad}^*_g p = p$) maps $M_p$ into itself. We denote this stationary subgroup by
$G_p$. The group $G_p$ is a Lie group, and it acts on the level set $M_p$ of the
momentum.

The reduced phase space is obtained from $M_p$ by factoring by the action
of the group $G_p$. In order for such a factorization to make sense, it is necessary
to make several assumptions. For example, it is sufficient to assume that

1. p is a regular value, so that $M_p$ is a manifold,

2. The stationary subgroup $G_p$ is compact, and

3. The elements of the group $G_p$ act on $M_p$ without fixed points.


Remark. These conditions can be weakened. For example, instead of compactness of the
group $G_p$, we can require that the action be proper (i.e., that the inverse images of compact sets
under the mapping $(g, x) \mapsto (g(x), x)$ are compact). For example, the actions of a group on
itself by left and right translation are always proper.

If conditions (1), (2), and (3) are satisfied, then it is easy to give the set of
orbits of the action of $G_p$ on $M_p$ the structure of a smooth manifold. Namely,
a chart on a neighborhood of a point $x \in M_p$ is furnished by any local
transversal to the orbit $G_p x$, whose dimension is equal to the codimension of the
orbit.

The  resulting  manifold  of  orbits  is  called  the  reduced  phase  space  of  a 
system  with  symmetry. 

We will denote the reduced phase space corresponding to a value p of the
momentum by $F_p$. The manifold $F_p$ is the base space of the bundle $\pi: M_p \to F_p$
with fiber diffeomorphic to the group $G_p$.

There is a natural symplectic structure on the reduced phase space $F_p$.
Namely, consider any two vectors $\xi$ and $\eta$ tangent to $F_p$ at the point f. The
point f is one of the orbits of the group $G_p$ on the manifold $M_p$. Let x be
one of the points of this orbit. The vectors $\xi$ and $\eta$ tangent to $F_p$ are obtained
from some vectors $\xi'$ and $\eta'$ tangent to $M_p$ at the point x by the projection
$\pi: M_p \to F_p$.

Definition. The skew-scalar product of two vectors $\xi$ and $\eta$ which are tangent
to a reduced phase space at the same point, is the skew-scalar product of
the corresponding vectors $\xi'$ and $\eta'$, tangent to the original symplectic
manifold M:

$$[\xi, \eta]_p = [\xi', \eta'].$$

Theorem.101 The skew-scalar product of the vectors $\xi$ and $\eta$ does not depend
on the choices of the point x and representatives $\xi'$ and $\eta'$, and gives a
symplectic structure on the reduced phase space.

Corollary.  The  reduced  phase  space  is  even-dimensional. 

Proof of the theorem. We look at the following two spaces in the tangent
space to M at x:

$T(M_p)$, the tangent space to the level manifold $M_p$, and
$T(Gx)$, the tangent space to the orbit of the group G.

Lemma.  These  two  spaces  are  skew-orthogonal  complements  to  one  another 
in  TM. 

Proof. A vector $\xi$ lies in the skew-orthogonal complement to the tangent plane of an orbit of
the group G if and only if the skew-scalar product of the vector $\xi$ with velocity vectors of the
hamiltonian flow of the group G is equal to zero (by definition). But these skew-scalar products
are equal to the derivatives of the corresponding hamiltonian functions in the direction $\xi$.
Therefore, the vector $\xi$ lies in the skew-orthogonal complement to the orbit of G if and only if
the derivative of the momentum in the direction $\xi$ is equal to zero, i.e., if $\xi$ lies in $T(M_p)$. □

The representatives $\xi'$ and $\eta'$ are defined up to addition of a vector from the tangent plane
to the orbit of the group $G_p$. But this tangent plane is the intersection of the tangent planes to
the orbit $Gx$ and to the manifold $M_p$ (by the last theorem of part A). Consequently, the addition
to $\xi'$ of a vector from $T(G_p x)$ does not change the skew-scalar product with any vector $\eta'$ from
$T(M_p)$ (since by the lemma $T(G_p x)$ is skew-orthogonal to $T(M_p)$). Thus, we have shown the
independence from the representatives $\xi'$ and $\eta'$.

The independence of the quantity $[\xi, \eta]_p$ from the choice of the point x of the orbit f follows
from the symplectic nature of the action of the group G on M and the invariance of $M_p$. Thus
we have defined a differential 2-form on $F_p$:

$$\Omega_p(\xi, \eta) = [\xi, \eta]_p.$$

It is nondegenerate, since if $[\xi, \eta]_p = 0$ for every $\eta$, then the corresponding representative
$\xi'$ is skew-orthogonal to all vectors in $T(M_p)$. Therefore, $\xi'$ must lie in the skew-orthogonal
complement to $T(M_p)$ in TM. Then by the lemma $\xi' \in T(Gx)$. Since $\xi'$ is also tangent to $M_p$,
it is tangent to the orbit $G_p x$, i.e., $\xi = 0$.

The form $\Omega_p$ is closed. In order to verify this we consider a chart, i.e., a piece of submanifold
in $M_p$, transversally intersecting the orbit of the group $G_p$ in one point.

The form $\Omega_p$ is represented in this chart by a 2-form induced from the 2-form $\omega$ which defines
the symplectic structure in the whole space M, by means of the embedding of the submanifold
piece. Since the form $\omega$ is closed, the induced form is also closed. The theorem is proved. □


101 The theorem was first formulated in this form by Marsden and Weinstein. Many special
cases have been considered since the time of Jacobi and used by Poincaré and his successors in
mechanics, by Kirillov and Kostant in group theory, and by Faddeev in the general theory of
relativity.

Example 1. Let $M = \mathbb{R}^{2n}$ be euclidean space of dimension 2n with
coordinates $p_k$, $q_k$ and 2-form $\sum dp_k \wedge dq_k$. Let $G = S^1$ be the circle, and let the
action of G on M be given by the hamiltonian of a harmonic oscillator

$$H = \tfrac{1}{2} \sum (p_k^2 + q_k^2).$$

Then the momentum mapping is simply $H: \mathbb{R}^{2n} \to \mathbb{R}$, a nonzero
momentum level manifold is a sphere $S^{2n-1}$, and the quotient space is the complex
projective space $\mathbb{C}P^{n-1}$.

The  preceding  theorem  defines  a  symplectic  structure  on  this  complex 
projective  space.  It  is  easy  to  verify  that  this  structure  coincides  (up  to  a 
multiple)  with  the  one  we  constructed  in  Appendix  3. 
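
A small numerical sketch of Example 1 may help fix ideas. It is illustrative only: the function names are hypothetical, and the direction of the flow assumes Hamilton's equations in the form $\dot{q} = \partial H/\partial p$, $\dot{p} = -\partial H/\partial q$. The circle acts by simultaneous rotation in each $(p_k, q_k)$ plane, the energy H is constant along the orbits (so each orbit lies in a sphere $S^{2n-1}$), the action has period $2\pi$, and the quotient of the $(2n - 1)$-dimensional level set by the circle has dimension $2n - 2$.

```python
import math

def energy(p, q):
    """Harmonic oscillator hamiltonian H = (1/2) * sum(p_k^2 + q_k^2)."""
    return 0.5 * sum(pk * pk + qk * qk for pk, qk in zip(p, q))

def circle_action(t, p, q):
    """Flow of H for time t: a rotation by angle t in every (p_k, q_k) plane."""
    c, s = math.cos(t), math.sin(t)
    new_p = [c * pk - s * qk for pk, qk in zip(p, q)]
    new_q = [s * pk + c * qk for pk, qk in zip(p, q)]
    return new_p, new_q

def reduced_dimension(n):
    """Dimension of the quotient of the level sphere S^(2n-1) by the circle."""
    level_set_dimension = 2 * n - 1
    return level_set_dimension - 1

p0 = [1.0, -2.0, 0.5]
q0 = [0.3, 0.7, -1.1]
p1, q1 = circle_action(1.234, p0, q0)        # a point on the same orbit
p2, q2 = circle_action(2 * math.pi, p0, q0)  # one full turn around the circle
```

For n = 3 the reduced space has real dimension 4, in agreement with the complex projective plane.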

Example 2. Let M be the cotangent bundle of a Lie group, G the same group
and the action defined by left translation. Then $M_p$ is a submanifold of the
cotangent bundle of G, formed by those vectors which, after right translation
to the identity of the group, define the same element in the dual space to the
Lie algebra.

The manifolds $M_p$ are diffeomorphic to the group itself and are
right-invariant cross-sections of the cotangent bundle. All the values p are regular.

The stationary subgroup $G_p$ of the point p consists of those elements of
the group for which left and right translation of p give the same result. The
actions of elements different from the identity of $G_p$ on $M_p$ have no fixed
points (since there are none by right translation of the group onto itself).

The group $G_p$ acts properly (cf. remark above). Consequently, the space
of orbits of the group $G_p$ on $M_p$ is a symplectic manifold.

But this space of orbits is easily identified with the orbit of the point p
in the co-adjoint representation. Actually, we map the right-invariant
section $M_p$ of the cotangent bundle into the cotangent space to the group at
the identity with left translations. We get a mapping

$$\pi: M_p \to \mathfrak{g}^*.$$

The image of this mapping is the orbit of the point p in the co-adjoint
representation, and the fibers are the orbits of the action of the group $G_p$.
The symplectic structure of the reduced phase space thus defines a symplectic
structure in the orbits of the co-adjoint representation.

It  is  not  hard  to  verify  by  direct  calculation  that  this  is  the  same  structure 
which  we  discussed  in  Appendix  2. 

Example 3. Let the group $G = S^1$, the circle, and let it act without fixed
points on a manifold V. Then there is an action of the circle on the cotangent
bundle $M = T^*V$. We can define momentum level manifolds $M_p$ (of
codimension 1 in M) and quotient manifolds $F_p$ (the dimension of which is 2
less than the dimension of M).

In addition, we can construct a quotient manifold of the configuration
space V by identifying the points of each orbit of the group on V. We denote
this quotient manifold by W.

Theorem. The reduced phase space $F_p$ is symplectic and diffeomorphic to the
cotangent bundle of the quotient configuration manifold W.


Proof. Let $\pi: V \to W$ be the factorization map, and $\omega \in T^*W$ a 1-form on W at the point $w = \pi v$.
The form $\pi^*\omega$ on V at the point v belongs to $M_0$ and projects to a point in the quotient $F_0$.
Conversely, the elements of $F_0$ are the invariant 1-forms on V which are equal to zero on the
orbits; they define 1-forms in W. We have constructed a mapping $T^*W \to F_0$; it is easy to see
that this is a symplectic diffeomorphism.

The case $p \neq 0$ is reduced to the case $p = 0$ as follows. Consider a riemannian metric on
V, invariant with respect to G. The intersection of $M_p$ with the cotangent plane to V at the point v
is a hyperplane. The quadratic form defined by the metric has a unique minimum point $S(v)$ in
this hyperplane. Subtraction of the vector $S(v)$ carries the hyperplane $M_p \cap T^*V_v$ into
$M_0 \cap T^*V_v$, and we obtain a possibly nonsymplectic diffeomorphism $F_p \to F_0$.

The difference between the symplectic structures on $T^*W$ induced by that of $F_p$ and $F_0$ is a
2-form, induced by a 2-form on W. □

C  Applications  to  the  study  of  stationary  rotations 
and  bifurcations  of  invariant  manifolds 

Suppose that we are given a Poisson action of a group G on a symplectic
manifold M; let H be a function on M invariant under G. Let $F_p$ be a reduced
phase space (we assume that the conditions under which this can be defined
are satisfied).

The hamiltonian field with hamiltonian function H is tangent to every
momentum level manifold $M_p$ (since momentum is a first integral). The
induced field on $M_p$ is invariant with respect to $G_p$ and defines a field on the
reduced phase space $F_p$. This vector field on $F_p$ will be called the reduced
field.

Theorem.  The  reduced  field  on  the  reduced  phase  space  is  hamiltonian.  The 
value  of  the  hamiltonian  function  of  the  reduced  field  at  any  point  of  the 
reduced  phase  space  is  equal  to  the  value  of  the  original  hamiltonian  function 
at  the  corresponding  point  of  the  original  phase  space. 

Proof. The relation defining a hamiltonian field $X_H$ with hamiltonian H on a manifold M
with form $\omega$

$$dH(\xi) = \omega(\xi, X_H) \quad \text{for every } \xi$$

implies an analogous relation for the reduced field in view of the definition of the symplectic
structure on $F_p$. □

Example.  Consider  an  asymmetric  rigid  body,  fixed  at  a  stationary  point, 
under  the  action  of  the  force  of  gravity  (or  any  potential  force  symmetric 
with  respect  to  the  vertical  axis). 

The group $S^1$ of rotations with respect to a vertical line acts on the
configuration space SO(3). The hamiltonian function is invariant under
rotations, and therefore we obtain a reduced system on the reduced phase space.

The  reduced  phase  space  is,  in  this  case,  the  cotangent  bundle  of  the 
quotient  configuration  space  (cf.  Example  3  above).  Factorization  of  the 
configuration  space  by  the  action  of  rotations  around  the  vertical  axis  was 
done  by  Poisson  in  the  following  way. 

We will specify the position of the body by giving the position of an
orthonormal frame $(e_1, e_2, e_3)$. The three vertical components of the basis vectors
give a vector in three-dimensional euclidean space. The length of this vector
is 1 (why?). This Poisson vector102 $\gamma$ determines the original frame up to
rotations around a vertical line (why?).
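
The two properties of the Poisson vector stated above can be checked numerically. In the sketch below (hypothetical names; the vertical direction is assumed to be the third coordinate axis) the frame is stored as a rotation matrix whose columns are $e_1, e_2, e_3$, so the vertical components of the frame vectors form the third row of that matrix. A row of an orthogonal matrix has length 1, and multiplying on the left by a rotation about the vertical axis does not change the third row.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]

def rot_x(t):
    """Rotation by angle t about the first (horizontal) axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    """Rotation by angle t about the third (vertical) axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def poisson_vector(frame):
    """Vertical components of the frame vectors e1, e2, e3 (matrix columns)."""
    return tuple(frame[2][i] for i in range(3))

# An arbitrary position of the body, built from three rotations.
frame = matmul(rot_z(0.4), matmul(rot_x(1.1), rot_z(-0.7)))
gamma = poisson_vector(frame)
length = math.sqrt(sum(g * g for g in gamma))

# The same body turned by an arbitrary angle about the vertical line.
turned = matmul(rot_z(2.0), frame)
gamma_turned = poisson_vector(turned)
```

This illustrates only that frames differing by a vertical rotation share the same Poisson vector; the converse statement in the text needs a separate argument.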

Thus the quotient configuration space is represented by a two-dimensional
sphere $S^2$, and the reduced phase space is the cotangent bundle $T^*S^2$ with a
nonstandard symplectic structure. The reduced hamiltonian function on the
cotangent  bundle  is  represented  as  the  sum  of  the  “kinetic  energy  of  the 
reduced  motion,”  which  is  quadratic  in  the  cotangent  vectors,  and  the 
“effective  potential”  (the  sum  of  the  potential  energy  and  the  kinetic  energy  of 
rotation  around  a  vertical  line). 

The transition to the reduced phase space in this case is almost by “elimination of the cyclic
coordinate $\varphi$.” The difference is that the usual procedure of elimination requires that the
configuration or phase space be a direct product by the circle, whereas in our case we have only a
bundle. This bundle can be made a direct product by decreasing the size of the configuration
space (i.e., by introducing coordinates with singularities at the poles); the advantage of the
approach above is that it makes it clear that there are no real singularities (except singularities
of the coordinate system) near the poles.

Definition. The phase curves in M which project to equilibrium positions in
the reduced system on the reduced phase space $F_p$ are called the relative
equilibria of the original system.

Example.  Stationary  rotations  of  a  rigid  body  which  is  fixed  at  its  center  of 
mass  are  relative  equilibria.  In  the  same  way,  rotations  of  a  heavy  rigid  body 
with  constant  speed  around  the  vertical  axis  are  relative  equilibria. 

Theorem.  A  phase  curve  of  a  system  with  a  G-invariant  hamiltonian  function  is  a 
relative  equilibrium  if  and  only  if  it  is  the  orbit  of  a  one-parameter  subgroup 
of  G  in  the  original  phase  space. 

Proof. It is clear that a phase curve which is an orbit projects to a point. If a phase curve $x(t)$
projects to a point, then it can be expressed uniquely in the form $x(t) = g(t)x(0)$, and it is then
easy to see that $\{g(t)\}$ is a subgroup. □


102 Poisson showed that the equations of motion of a heavy rigid body can be written in terms
of $\gamma$ in a remarkably simple form, the “Euler-Poisson equations”:

$$\frac{dM}{dt} - [M, \omega] = \mu g[\gamma, l], \qquad \frac{d\gamma}{dt} = [\gamma, \omega].$$
Corollary 1. An asymmetrical rigid body in an axially symmetric potential
field, fixed at a point on the axis of the field, has at least two stationary
rotations (for every value of the angular momentum with respect to the axis
of symmetry).

Corollary 2. An axially symmetric rigid body fixed at a point on the axis of
symmetry has at least two stationary rotations (for every value of the angular
momentum with respect to the axis of symmetry).

Both  corollaries  follow  from  the  fact  that  a  function  on  the  sphere  has  at 
least  two  critical  points. 

Another  application  of  relative  equilibria  is  that  they  can  be  used  to 
investigate  modifications  of  the  topology  of  invariant  manifolds  under 
changes  of  the  energy  and  momentum  values. 

Theorem. The critical points of the momentum and energy mapping

$$P \times H : M \to \mathfrak{g}^* \times \mathbb{R}$$

on a regular momentum level set are exactly the relative equilibria.

Proof. The critical points of the mapping $P \times H$ are the conditional extrema of $H$ on the
momentum level manifold $M_p$ (since this level manifold is regular, i.e., for every $x$ in $M_p$, we
have $P_* TM_x = T\mathfrak{g}^*_p$).

After factorization by $G_p$, the conditional extrema of $H$ on $M_p$ define the critical points of
the reduced hamiltonian function (since $H$ is invariant under $G_p$). □

The  detailed  study  of  relative  equilibria  and  singularities  of  the  energy- 
momentum  mapping  is  not  simple  and  has  not  been  completely  carried  out, 
even  in  the  classical  problem  of  the  motions  of  an  asymmetrical  rigid  body 
in  a  gravitational  field.  The  case  when  the  center  of  gravity  lies  on  one  of  the 
principal  axes  of  inertia  is  treated  in  the  supplement  written  by  S.  B.  Katok 
to  the  Russian  translation103  of  the  article  by  S.  Smale  cited  in  the  beginning 
of this appendix. In this problem the dimension of the phase space is six, and
the group is the circle; the reduced phase space $T^*S^2$ is four-dimensional.

The nonsingular energy level manifolds in the reduced phase space are
(depending on the values of momentum and energy) of the following four
forms: $S^3$, $S^2 \times S^1$, $\mathbb{RP}^3$, and a “pretzel” obtained from the three-sphere $S^3$
by attaching two “handles” of the form

$$S^1 \times D^2 \qquad (D^2 \text{ is the disc } \{(x, y) \mid x^2 + y^2 \le 1\}).$$


103 Uspekhi Matematicheskikh Nauk 27, no. 2 (1972) 78-133.
In this appendix we give a list of normal forms to which we can reduce a
quadratic hamiltonian function by means of a real symplectic transformation.
This list was composed by D. M. Galin based on the work of J. Williamson
in “On an algebraic problem concerning the normal forms of linear dynamical
systems,” Amer. J. of Math. 58 (1936), 141-163. Williamson’s paper gives
the normal forms to which a quadratic form in a symplectic space over any
field can be reduced.

A  Notation 

We will write the hamiltonian as

$$H = \tfrac{1}{2}(Ax, x),$$

where $x = (p_1, \ldots, p_n; q_1, \ldots, q_n)$ is a vector written in a symplectic basis
and $A$ is a symmetric linear operator. The canonical equations then have the
form

$$\dot{x} = IAx, \qquad \text{where } I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}.$$

By the eigenvalues of the hamiltonian we will mean the eigenvalues of the
linear infinitesimally-symplectic operator $IA$. In the same way, by a Jordan
block we will mean a Jordan block of the operator $IA$.

The eigenvalues of the hamiltonian are of four types: real pairs $(a, -a)$,
purely imaginary pairs $(ib, -ib)$, quadruples $(\pm a \pm ib)$, and zero eigenvalues.
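
The pairing of eigenvalues can be checked on a small example. The following is only a sketch, assuming the convention $x = (p; q)$ with $I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$; the matrix `A` and the sample points are arbitrary illustrative choices. It verifies with exact rational arithmetic that the characteristic polynomial of $IA$ takes the same value at $\lambda$ and $-\lambda$, which is what forces the eigenvalues to occur in the pairs and quadruples listed above.

```python
from fractions import Fraction

def det(m):
    """Determinant by exact (fraction-based) Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return sign * result

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def symplectic_unit(n):
    """I = [[0, -E], [E, 0]] for x = (p; q) (assumed convention)."""
    size = 2 * n
    unit = [[0] * size for _ in range(size)]
    for j in range(n):
        unit[j][n + j] = -1
        unit[n + j][j] = 1
    return unit

def char_poly_at(M, lam):
    """Value of det(M - lam * identity)."""
    n = len(M)
    return det([[M[r][c] - (lam if r == c else 0) for c in range(n)] for r in range(n)])

# An arbitrary symmetric A for n = 2 (illustrative values only).
A = [[2, 1, 0, 3],
     [1, -1, 4, 0],
     [0, 4, 5, 2],
     [3, 0, 2, 1]]
M = matmul(symplectic_unit(2), A)
samples = [Fraction(1, 2), 1, 3, Fraction(7, 5)]
even = all(char_poly_at(M, s) == char_poly_at(M, -s) for s in samples)
print(even)
```

Exact fractions are used so that the equality test is not blurred by rounding; sampling a few points does not prove evenness, it only illustrates it.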

The  Jordan  blocks  corresponding  to  the  two  members  of  a  pair  or  four 
members  of  a  quadruple  always  have  the  same  structure. 

In the case when the real part of an eigenvalue is zero, we have to distinguish
the Jordan blocks of even and odd order. There are an even number of
blocks of odd order with zero eigenvalue and they can be naturally divided
into pairs.

A  complete  list  of  normal  forms  follows. 

B  Hamiltonians 

For a pair of Jordan blocks of order $k$ with eigenvalues $\pm a$, the hamiltonian
is

$$H = -a \sum_{j=1}^{k} p_j q_j + \sum_{j=1}^{k-1} p_j q_{j+1}.$$
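
As an illustration, we can check for $k = 2$ that this hamiltonian really produces two Jordan blocks of order 2. The sketch below assumes the convention $x = (p_1, p_2, q_1, q_2)$, $H = \tfrac{1}{2}(Ax, x)$ and $I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$; the value `a = 2` is an arbitrary choice. It forms $M = IA$ and tests that $(M - a)(M + a) \neq 0$ while $(M - a)^2 (M + a)^2 = 0$.

```python
def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def shifted(M, s):
    """Return M + s * identity."""
    return [[v + (s if r == c else 0) for c, v in enumerate(row)]
            for r, row in enumerate(M)]

def is_zero(M):
    return all(v == 0 for row in M for v in row)

a = 2  # illustrative value of the real eigenvalue

# H = -a (p1 q1 + p2 q2) + p1 q2, written as (1/2)(Ax, x) with x = (p1, p2, q1, q2):
# the off-diagonal entries of A are the coefficients of the mixed products.
A = [[0, 0, -a, 1],
     [0, 0, 0, -a],
     [-a, 0, 0, 0],
     [1, -a, 0, 0]]
I = [[0, 0, -1, 0],
     [0, 0, 0, -1],
     [1, 0, 0, 0],
     [0, 1, 0, 0]]

M = matmul(I, A)                                   # matrix of the canonical equations
first = matmul(shifted(M, -a), shifted(M, a))      # (M - a)(M + a)
second = matmul(first, first)                      # (M - a)^2 (M + a)^2, factors commute
print(M, is_zero(first), is_zero(second))
```

The matrix `M` comes out block diagonal: one block acts on the $p_j$ with eigenvalue $a$, the other on the $q_j$ with eigenvalue $-a$, each with a single off-diagonal entry.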

For a quadruple of Jordan blocks of order $k$ with eigenvalues $\pm a \pm bi$
the hamiltonian is

$$H = -a \sum_{j=1}^{2k} p_j q_j + b \sum_{j=1}^{k} (p_{2j-1} q_{2j} - p_{2j} q_{2j-1}) + \sum_{j=1}^{2k-2} p_j q_{j+2}.$$
For a pair of Jordan blocks of order $k$ with eigenvalue zero the hamiltonian
is

$$H = \sum_{j=1}^{k-1} p_j q_{j+1} \qquad (\text{for } k = 1,\ H = 0).$$

For a Jordan block of order $2k$ with eigenvalue zero, the hamiltonian is
of one of the following two inequivalent types:

$$H = \pm \frac{1}{2} \left( \sum_{j=1}^{k-1} p_j p_{k-j} - \sum_{j=1}^{k} q_j q_{k-j+1} \right) - \sum_{j=1}^{k-1} p_j q_{j+1}$$

(for $k = 1$ this is $H = \pm \tfrac{1}{2} q_1^2$).

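For a pair of Jordan blocks of odd order $2k + 1$ with purely imaginary
eigenvalues $\pm bi$, the hamiltonian is of one of the following two inequivalent
types:

$$H = \pm \frac{1}{2} \left[ \sum_{j=1}^{k} \left( b^2 p_{2j} p_{2k-2j+2} + q_{2j} q_{2k-2j+2} \right) - \sum_{j=1}^{k+1} \left( b^2 p_{2j-1} p_{2k-2j+3} + q_{2j-1} q_{2k-2j+3} \right) \right] - \sum_{j=1}^{2k} p_j q_{j+1}.$$

For $k = 0$, $H = \pm \tfrac{1}{2}(b^2 p_1^2 + q_1^2)$.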

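For a pair of Jordan blocks of even order $2k$ with purely imaginary
eigenvalues $\pm bi$, the hamiltonian is of one of the following two inequivalent types:

$$H = \pm \frac{1}{2} \left[ \sum_{j=1}^{k} \left( \frac{1}{b^2} q_{2j-1} q_{2k-2j+1} + q_{2j} q_{2k-2j+2} \right) - \sum_{j=1}^{k-1} \left( b^2 p_{2j+1} p_{2k-2j+1} + p_{2j+2} p_{2k-2j+2} \right) \right] - b^2 \sum_{j=1}^{k} p_{2j-1} q_{2j} + \sum_{j=1}^{k} p_{2j} q_{2j-1}$$

(for $k = 1$, $H = \pm \tfrac{1}{2} \left( \tfrac{1}{b^2} q_1^2 + q_2^2 \right) - b^2 p_1 q_2 + p_2 q_1$).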


Williamson’s  theorem.  A  real  symplectic  vector  space  with  a  given  quadratic 
form  H  can  be  decomposed  into  a  direct  sum  of  pairwise  skew  orthogonal  real 
symplectic  subspaces  so  that  the  form  H  is  represented  as  a  sum  of  forms  of 
the  types  indicated  above  on  these  subspaces. 

C  Nonremovable  Jordan  blocks 

An  individual  hamiltonian  in  “general  position”  does  not  have  multiple 
eigenvalues  and  reduces  to  a  simple  form  (all  the  Jordan  blocks  are  of  first 
order).  However,  if  we  consider  not  an  individual  hamiltonian  but  a  whole 
family of systems depending on parameters, then for some exceptional
values of the parameters more complicated Jordan structures can arise. We
can get rid of some of these by a small change of the family; others are
nonremovable and only slightly deformed after a small change of the family. If
the number $l$ of parameters of the family is finite, then the number of
nonremovable types in $l$-parameter families is finite. The theorem of Galin
formulated below allows us to count all these types for any fixed $l$.

We denote by $n_1(z) \ge n_2(z) \ge \cdots \ge n_s(z)$ the dimensions of the Jordan
blocks with eigenvalues $z \ne 0$, and by $m_1 \ge m_2 \ge \cdots \ge m_u$ and
$\tilde{m}_1 \ge \tilde{m}_2 \ge \cdots \ge \tilde{m}_v$ the dimensions of the Jordan blocks with eigenvalue zero,
where the $m_i$ are even and the $\tilde{m}_i$ are odd (of every pair of blocks of odd
dimension, only one is considered).


Theorem. In the space of all hamiltonians, the manifold of hamiltonians with
Jordan blocks of the indicated dimensions has codimension

$$c = \frac{1}{2} \sum_{z \ne 0} \left[ \sum_{j=1}^{s(z)} (2j - 1)\, n_j(z) - 1 \right] + \frac{1}{2} \sum_{j=1}^{u} (2j - 1)\, m_j + \sum_{j=1}^{v} \left[ 2(2j - 1)\, \tilde{m}_j + 1 \right] + 2 \sum_{j=1}^{u} \sum_{k=1}^{v} \min\{m_j, \tilde{m}_k\}.$$

(Note that, if zero is not an eigenvalue, then only the first term in the sum
is not zero.)


Corollary.  In  l-parameter  families  in  general  position  of  linear  hamiltonian 
systems,  the  only  systems  which  occur  are  those  with  Jordan  blocks  such  that 
the number $c$ calculated by the formula above is not greater than $l$: all
cases  with  larger  c  can  be  eliminated  by  a  small  change  of  the  family. 


Corollary. In one- and two-parameter families, nonremovable Jordan blocks of
only the following 12 types occur:

$l = 1$: $(\pm a)^2$, $(\pm ia)^2$, $0^2$

(here the Jordan blocks are denoted by their determinants; for example,
$(\pm a)^2$ denotes a pair of Jordan blocks of order 2 with eigenvalues $a$ and
$-a$, respectively);

$l = 2$: $(\pm a)^3$, $(\pm ai)^3$, $(\pm a \pm bi)^2$, $0^4$, $(\pm a)^2(\pm b)^2$, $(\pm ai)^2(\pm bi)^2$,
$(\pm a)^2(\pm bi)^2$, $(\pm a)^2 0^2$, $(\pm ai)^2 0^2$

(the remaining eigenvalues are simple).
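
The count in the corollary can be reproduced mechanically. The sketch below is only an illustration, assuming the codimension formula of Galin's theorem is read as $c = \tfrac12 \sum_{z \ne 0} [\sum_j (2j-1) n_j(z) - 1] + \tfrac12 \sum_j (2j-1) m_j + \sum_j [2(2j-1)\tilde m_j + 1] + 2 \sum_{j,k} \min\{m_j, \tilde m_k\}$. The function name `codimension` and its argument layout are hypothetical; simple eigenvalues contribute zero and are omitted.

```python
from fractions import Fraction

def codimension(nonzero=None, even_zero=(), odd_zero=()):
    """Codimension c for a given Jordan structure.

    nonzero:   dict mapping each nonzero eigenvalue (any label) to its block sizes
    even_zero: sizes m_j of the even-order blocks with eigenvalue zero
    odd_zero:  sizes of the odd-order zero blocks, one entry per pair
    """
    c = Fraction(0)
    for sizes in (nonzero or {}).values():
        n = sorted(sizes, reverse=True)
        c += Fraction(1, 2) * (sum((2 * j - 1) * nj for j, nj in enumerate(n, 1)) - 1)
    m = sorted(even_zero, reverse=True)
    mt = sorted(odd_zero, reverse=True)
    c += Fraction(1, 2) * sum((2 * j - 1) * mj for j, mj in enumerate(m, 1))
    c += sum(2 * (2 * j - 1) * mj + 1 for j, mj in enumerate(mt, 1))
    c += 2 * sum(min(mj, mk) for mj in m for mk in mt)
    return c

def pair(label, order):
    """A pair of blocks of the given order with eigenvalues +label and -label."""
    return {"+" + label: [order], "-" + label: [order]}

types = {
    "(+-a)^2": codimension(pair("a", 2)),
    "0^2": codimension(even_zero=[2]),
    "(+-a)^3": codimension(pair("a", 3)),
    "(+-a +-bi)^2": codimension({"a+bi": [2], "a-bi": [2], "-a+bi": [2], "-a-bi": [2]}),
    "0^4": codimension(even_zero=[4]),
    "(+-a)^2 (+-b)^2": codimension({**pair("a", 2), **pair("b", 2)}),
    "(+-a)^2 0^2": codimension(pair("a", 2), even_zero=[2]),
}
for name, c in types.items():
    print(name, c)
```

The purely imaginary cases give the same numbers as the real ones, since the formula only sees the block sizes. A pair of first-order blocks with eigenvalue zero (`odd_zero=[1]`) already has codimension 3 under this reading, which is consistent with its absence from the list.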
Galin has also computed the normal forms to which one can reduce any
family of linear hamiltonian systems which depend smoothly on parameters,
by using a symplectic linear change of coordinates which depends smoothly
on the parameters. For example, for the simplest Jordan square $(\pm a)^2$, the
normal form of the hamiltonian will be

$$H(\lambda) = -a(p_1 q_1 + p_2 q_2) + p_1 q_2 + \lambda_1 p_1 q_1 + \lambda_2 p_2 q_1$$

($\lambda_1$ and $\lambda_2$ are the parameters).
Appendix 7: Normal forms of hamiltonian systems near stationary points and closed trajectories

In  studying  the  behavior  of  solutions  to  Hamilton’s  equations  near  an 
equilibrium  position,  it  is  often  insufficient  to  look  only  at  the  linearized 
equation. In fact, by Liouville’s theorem on the conservation of volume,
it is impossible to have asymptotically stable equilibrium positions for
hamiltonian systems. Therefore, the stability of the linearized system is always
neutral: the eigenvalues of the linear part of a hamiltonian vector field at a
stable equilibrium position all lie on the imaginary axis.

For  systems  of  differential  equations  in  general  form,  such  neutral 
stability  can  be  destroyed  by  the  addition  of  arbitrarily  small  nonlinear 
terms.  For  hamiltonian  systems  the  situation  is  more  complicated.  Suppose, 
for  example,  that  the  quadratic  part  of  the  hamiltonian  function  at  an 
equilibrium  position  (which  determines  the  linear  part  of  the  vector  field)  is 
(positive  or  negative)  definite.  Then  the  hamiltonian  function  has  a  maximum 
or  minimum  at  the  equilibrium  position.  Therefore,  this  equilibrium  position 
is  stable  (in  the  sense  of  Liapunov,  but  not  asymptotically),  not  only  for  the 
linearized  system  but  also  for  the  entire  nonlinear  system. 

On the other hand, the quadratic part of the hamiltonian function at a
stable equilibrium position may not be definite. A simple example is supplied
by the function $H = p_1^2 + q_1^2 - p_2^2 - q_2^2$. To investigate the stability of
systems with this kind of quadratic part, we must take into account terms of
degree $\ge 3$ in the Taylor series of the hamiltonian function (i.e., the terms of
degree $\ge 2$ for the phase velocity vector field). It is useful to carry out this
investigation by reducing the hamiltonian function (and, therefore, the
hamiltonian vector field) to the simplest possible form by a suitable canonical
change of variables. In other words, it is useful to choose a canonical
coordinate system, near the equilibrium position, in which the hamiltonian
function and equations of motion are as simple as possible.

The analogous question for general (non-hamiltonian) vector fields can
be solved easily: there the general case is that a vector field in a neighborhood
of an equilibrium position is linear in a suitable coordinate system (the
relevant theorems of Poincaré and Siegel can be found, for instance, in the
book Lectures on Celestial Mechanics, by C. L. Siegel and J. Moser,
Springer-Verlag, 1971).

In  the  hamiltonian  case  the  picture  is  more  complicated.  The  first  difficulty 
is  that  reduction  of  the  hamiltonian  field  to  a  linear  normal  form  by  a 
canonical  change  of  variables  is  generally  not  possible.  We  can  usually  kill 
the  cubic  part  of  the  hamiltonian  function,  but  we  cannot  kill  all  the  terms  of 
degree  four  (this  is  related  to  the  fact  that,  in  a  linear  system,  the  frequency  of 
oscillation  does  not  depend  on  the  amplitude,  while  in  a  nonlinear  system  it 
generally  does).  This  difficulty  can  be  surmounted  by  the  choice  of  a  nonlinear 
normal  form  which  takes  the  frequency  variations  into  account.  As  a  result, 
we  can  (in  the  “non-resonance”  case)  introduce  action-angle  variables  near 
an  equilibrium  position  so  that  the  system  becomes  integrable  up  to  terms  of 
arbitrarily high degree in the Taylor series.
This method allows us to study the behavior of systems over the course of
large intervals of time for initial conditions close to equilibrium. However,
it is not sufficient to determine whether an equilibrium position will be
Liapunov stable (since on an infinite time interval the influence of the
discarded remainder term of the Taylor series can destroy the stability). Such
stability  would  follow  from  an  exact  reduction  to  an  analogous  normal  form 
which  did  not  disregard  remainder  terms.  However,  we  can  show  that 
this  exact  reduction  is  generally  not  possible,  and  formal  series  for  canonical 
transformations  reducing  a  system  to  normal  form  generally  diverge. 

The  divergence  of  these  series  is  connected  with  the  fact  that  reduction 
to  normal  form  would  imply  simpler  behavior  of  the  phase  curves  (they 
would  have  to  be  conditionally-periodic  windings  of  tori)  than  that  which 
in  fact  occurs.  The  behavior  of  phase  curves  near  an  equilibrium  position  is 
discussed in Appendix 8. In this appendix we give the formal results on
normalization up to terms of high degree.

The idea of reducing hamiltonian systems to normal forms goes back to
Lindstedt and Poincaré;104 normal forms in a neighborhood of an
equilibrium position were extensively studied by G. D. Birkhoff (G. D. Birkhoff,
Dynamical Systems, American Math. Society, 1927).

Normal  forms  for  degenerate  cases  can  be  found  in  the  work  of  A.  D. 
Bruno,  “Analytic  forms  of  differential  equations,”  (Trudy  Moskovskovo 
matematischeskovo  obschchestva,  v.  25  and  v.  26). 

A  Normal  form  of  a  conservative  system  near  an 
equilibrium  position 

Suppose that in the linear approximation an equilibrium position of a
hamiltonian system with $n$ degrees of freedom is stable, and that all $n$
characteristic frequencies $\omega_1, \ldots, \omega_n$ are different. Then the quadratic part of the
hamiltonian can be reduced by a canonical linear transformation to the
form

$$H = \tfrac{1}{2}\omega_1(p_1^2 + q_1^2) + \cdots + \tfrac{1}{2}\omega_n(p_n^2 + q_n^2).$$

(Some of the numbers $\omega_k$ may be negative.)

Definition. The characteristic frequencies $\omega_1, \ldots, \omega_n$ satisfy a resonance
relation of order $K$ if there exist integers $k_i$ not all equal to zero such that

$$k_1\omega_1 + \cdots + k_n\omega_n = 0, \qquad |k_1| + \cdots + |k_n| = K.$$
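
For rational frequencies the resonance relations of low order can be listed by brute force. The following is a minimal sketch, assuming the frequencies are supplied as exact integers or fractions (an irrational frequency cannot be represented this way); the function name `resonances` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def resonances(omegas, max_order):
    """List (K, k) for integer vectors k != 0 with k . omega = 0 and K = |k_1| + ... + |k_n| <= max_order."""
    n = len(omegas)
    found = []
    for k in product(range(-max_order, max_order + 1), repeat=n):
        order = sum(abs(ki) for ki in k)
        if 0 < order <= max_order and sum(ki * w for ki, w in zip(k, omegas)) == 0:
            found.append((order, k))
    return sorted(found)

# Frequencies in the ratio 1 : 2 satisfy 2*omega_1 - omega_2 = 0, a resonance of order 3.
print(resonances((1, 2), 4))
# Frequencies in the ratio 2 : 3 have no resonance of order 4 or less.
print(resonances((Fraction(1), Fraction(3, 2)), 4))
```

Each relation appears together with its negative, since $k$ and $-k$ define the same resonance.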

Definition. A Birkhoff normal form of degree $s$ for a hamiltonian is a
polynomial of degree $s$ in the canonical coordinates $(P_i, Q_i)$ which is actually
a polynomial (of degree $[s/2]$) in the variables $\tau_i = (P_i^2 + Q_i^2)/2$.


104 Cf. H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Vol. 1, Dover, 1957.
For example, for a system with one degree of freedom the normal form of degree $2m$ (or $2m + 1$)
looks like

$$H_{2m} = H_{2m+1} = a_1\tau + a_2\tau^2 + \cdots + a_m\tau^m, \qquad \tau = (P^2 + Q^2)/2,$$

and for a system with two degrees of freedom the Birkhoff normal form of degree 4 will be

$$H_4 = a_1\tau_1 + a_2\tau_2 + a_{11}\tau_1^2 + a_{12}\tau_1\tau_2 + a_{22}\tau_2^2.$$

The coefficients $a_1$ and $a_2$ are characteristic frequencies, and the coefficients $a_{ij}$ describe the
dependence of the frequencies on the amplitude.

Theorem. Assume that the characteristic frequencies $\omega_i$ do not satisfy any
resonance relation of order $s$ or smaller. Then there is a canonical
coordinate system in a neighborhood of the equilibrium position such that
the hamiltonian is reduced to a Birkhoff normal form of degree $s$ up to terms
of order $s + 1$:

$$H(p, q) = H_s(P, Q) + R, \qquad R = O(|P| + |Q|)^{s+1}.$$


Proof. The proof of this theorem is easy to carry out in a complex coordinate system

$$z_l = p_l + iq_l, \qquad w_l = p_l - iq_l$$

(upon passing to this coordinate system we must multiply the hamiltonian by $-2i$). If the terms
of degree less than $N$ that do not enter into the normal form have already been killed, then the transformation
with generating function $Pq + S_N(P, q)$ (where $S_N$ is a homogeneous polynomial of degree $N$)
changes only terms of degree $N$ and higher in the Taylor expansion of the hamiltonian function.

Under this transformation the coefficient for a monomial of degree $N$ in the hamiltonian
function having the form

$$z_1^{\alpha_1} \cdots z_n^{\alpha_n} w_1^{\beta_1} \cdots w_n^{\beta_n} \qquad (\alpha_1 + \cdots + \alpha_n + \beta_1 + \cdots + \beta_n = N)$$

is changed by the quantity

$$s_{\alpha\beta}[\lambda_1(\beta_1 - \alpha_1) + \cdots + \lambda_n(\beta_n - \alpha_n)],$$

where $\lambda_l = i\omega_l$ and where $s_{\alpha\beta}$ is the coefficient for $z^\alpha w^\beta$ in the expansion of the function $S_N(P, q)$
in the variables $z$ and $w$.

Under the assumptions about the absence of resonance, the coefficient of $s_{\alpha\beta}$ in the square
brackets is not zero, except in the case when our monomial can be expressed in terms of the
products $z_l w_l = 2\tau_l$ (i.e., when all the $\alpha_l$ are equal to the $\beta_l$). Thus we can kill all terms of degree $N$
except those expressed in terms of the variables $\tau_l$. Setting $N = 3, 4, \ldots, s$, we obtain the theorem. □

To use Birkhoff’s theorem, it is helpful to note that a hamiltonian in normal
form is integrable. Consider the “canonical polar coordinates” $\tau_i$, $\varphi_i$, in
which $P_i$ and $Q_i$ can be expressed by the formulas

$$P_i = \sqrt{2\tau_i} \cos\varphi_i, \qquad Q_i = \sqrt{2\tau_i} \sin\varphi_i.$$

Since the hamiltonian is expressed in terms of only the action variables $\tau_i$,
the system is integrable and describes conditionally periodic motions on the
tori $\tau = \text{const}$ with frequencies $\omega = \partial H/\partial\tau$. In particular, the equilibrium
position $P = Q = 0$ is stable for the normal form.
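
The integrability can be seen concretely for one degree of freedom. The following sketch assumes a hypothetical normal form $H = a_1\tau + a_2\tau^2$ with illustrative coefficients `A1`, `A2`, and the sign convention $\dot P = -\partial H/\partial Q$, $\dot Q = \partial H/\partial P$. It integrates Hamilton's equations with a standard Runge-Kutta step and compares the result with the rotation of $P + iQ$ by the angle $(\partial H/\partial\tau)\,t$.

```python
import cmath

A1, A2 = 1.0, 0.5  # hypothetical coefficients of H = A1*tau + A2*tau**2

def dH_dtau(tau):
    """Frequency omega(tau) = dH/dtau; it depends on the amplitude through tau."""
    return A1 + 2.0 * A2 * tau

def rhs(P, Q):
    """Hamilton's equations dP/dt = -dH/dQ, dQ/dt = dH/dP for H = H(tau)."""
    w = dH_dtau((P * P + Q * Q) / 2.0)
    return -w * Q, w * P

def rk4(P, Q, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(P, Q)
        k2 = rhs(P + 0.5 * h * k1[0], Q + 0.5 * h * k1[1])
        k3 = rhs(P + 0.5 * h * k2[0], Q + 0.5 * h * k2[1])
        k4 = rhs(P + h * k3[0], Q + h * k3[1])
        P += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        Q += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return P, Q

def exact(P, Q, t):
    """Closed form: P + iQ rotates with the constant angular velocity omega(tau)."""
    tau = (P * P + Q * Q) / 2.0
    z = complex(P, Q) * cmath.exp(1j * dH_dtau(tau) * t)
    return z.real, z.imag

P0, Q0, T = 0.3, 0.4, 10.0
num = rk4(P0, Q0, T, 20000)
ref = exact(P0, Q0, T)
err = max(abs(num[0] - ref[0]), abs(num[1] - ref[1]))
tau_drift = abs((num[0] ** 2 + num[1] ** 2) / 2 - (P0 ** 2 + Q0 ** 2) / 2)
print(err, tau_drift)
```

Both printed numbers should be at the level of rounding error: the action $\tau$ is conserved and the angle advances uniformly, at a rate that depends on $\tau$.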



B  Normal  form  of  a  canonical  transformation  near  a  stationary  point 

Consider a canonical (i.e. area-preserving) mapping of the two-dimensional
plane to itself. Assume that this transformation leaves the origin fixed, and
that its linear part has eigenvalues $\lambda = e^{\pm i\alpha}$ (i.e., is a rotation by angle $\alpha$ in a
suitable symplectic basis with coordinates $p$, $q$). We will call such a
transformation elliptic.

Definition. A Birkhoff normal form of degree $s$ for a transformation is a
canonical transformation of the plane to itself which is a rotation by a variable
angle which is a polynomial of degree not more than $m = [s/2] - 1$
in the action variable $\tau$ of the canonical polar coordinate system:

$$(\tau, \varphi) \to (\tau, \varphi + \alpha_0 + \alpha_1\tau + \cdots + \alpha_m\tau^m),$$

where

$$p = \sqrt{2\tau} \cos\varphi, \qquad q = \sqrt{2\tau} \sin\varphi.$$
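
That such a rotation by a variable angle is area-preserving can be checked numerically. The sketch below is only an illustration: the coefficients in `ALPHA` are hypothetical values of $\alpha_0$, $\alpha_1$, and the Jacobian determinant of the map in the coordinates $(p, q)$ is estimated by central differences.

```python
import math

ALPHA = (0.7, 0.3)  # hypothetical coefficients alpha_0, alpha_1

def twist(p, q):
    """Rotate (p, q) by the angle alpha_0 + alpha_1 * tau, where tau = (p^2 + q^2) / 2."""
    tau = (p * p + q * q) / 2.0
    angle = sum(a * tau ** j for j, a in enumerate(ALPHA))
    c, s = math.cos(angle), math.sin(angle)
    return c * p - s * q, s * p + c * q

def jacobian_det(p, q, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of the map at (p, q)."""
    fp1, fp0 = twist(p + h, q), twist(p - h, q)
    fq1, fq0 = twist(p, q + h), twist(p, q - h)
    a = (fp1[0] - fp0[0]) / (2 * h)
    c = (fp1[1] - fp0[1]) / (2 * h)
    b = (fq1[0] - fq0[0]) / (2 * h)
    d = (fq1[1] - fq0[1]) / (2 * h)
    return a * d - b * c

points = [(0.1, 0.2), (0.5, -0.3), (-1.0, 0.8)]
dets = [jacobian_det(p, q) for p, q in points]
print(dets)
```

The determinants come out equal to 1 up to the accuracy of the finite differences, even though the angle of rotation varies from circle to circle.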

Theorem 2. If the eigenvalue $\lambda$ of an elliptic canonical transformation is not a
root of unity of degree $s$ or less, then this transformation can be reduced by a
canonical change of variables to a Birkhoff normal form of degree $s$ with
error terms of degree $s + 1$ and higher.

The multi-dimensional generalization of an elliptic transformation is the
direct product of $n$ elliptic rotations of the planes $(p_l, q_l)$ with eigenvalues
$\lambda_l = e^{\pm i\alpha_l}$. A Birkhoff normal form of degree $s$ is given by the formula

$$(\tau, \varphi) \to \left( \tau, \varphi + \frac{\partial S}{\partial \tau} \right),$$

where $S$ is a polynomial of degree not more than $[s/2]$ in the action variables
$\tau_1, \ldots, \tau_n$.

Theorem 3. If the eigenvalues $\lambda_l$ of a multi-dimensional elliptic canonical
transformation do not admit resonances

$$\lambda_1^{k_1} \cdots \lambda_n^{k_n} = 1, \qquad |k_1| + \cdots + |k_n| \le s,$$

then this transformation can be reduced to a Birkhoff normal form of degree $s$
(with error in terms of degree $s$ in the expansion of the mapping in a Taylor
series at the point $p = q = 0$).

C  Normal  form  of  an  equation  with  periodic  coefficients 
near  an  equilibrium  position 

Let $p = q = 0$ be an equilibrium position of a system whose hamiltonian
function depends $2\pi$-periodically on time. Assume that the linearized
equation can be reduced by a linear symplectic time-periodic transformation to an
autonomous normal form with characteristic frequencies $\omega_1, \ldots, \omega_n$.
We say that a system is resonant of order $K > 0$ if there is a relation

$$k_1\omega_1 + \cdots + k_n\omega_n + k_0 = 0$$

with integers $k_0, k_1, \ldots, k_n$ for which $|k_1| + \cdots + |k_n| = K$.
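
In the periodic case the integer $k_0$ is free, so a resonance occurs whenever the combination $k_1\omega_1 + \cdots + k_n\omega_n$ is itself an integer. The following is a small sketch of a search for the lowest order of resonance, assuming rational frequencies given exactly; the function name `lowest_resonance_order` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def lowest_resonance_order(omegas, max_order):
    """Smallest K = |k_1| + ... + |k_n| > 0 such that k . omega + k_0 = 0 for some integer k_0.

    Returns None if no resonance of order <= max_order exists.
    """
    n = len(omegas)
    best = None
    for k in product(range(-max_order, max_order + 1), repeat=n):
        order = sum(abs(ki) for ki in k)
        if order == 0 or order > max_order:
            continue
        combo = sum(ki * Fraction(w) for ki, w in zip(k, omegas))
        # k_0 = -combo is an admissible integer exactly when combo is an integer
        if combo.denominator == 1 and (best is None or order < best):
            best = order
    return best

print(lowest_resonance_order([Fraction(1, 3)], 5))
print(lowest_resonance_order([Fraction(2, 7)], 5))
```

The first call finds the order 3, since $3 \cdot \tfrac13 - 1 = 0$; the second finds nothing up to order 5.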

Theorem. If a system is not resonant of order $s$ or less, then there is a
$2\pi$-periodic time-dependent canonical transformation reducing the system in a
neighborhood of an equilibrium position to the same Birkhoff normal form
of degree $s$ as if the system were autonomous, with only the difference that
the remainder terms $R$ of degree $s + 1$ and higher will depend periodically
on time.

Finally,  suppose  that  we  are  given  a  closed  trajectory  of  an  autonomous 
hamiltonian  system.  Then,  in  a  neighborhood  of  this  trajectory,  we  can 
reduce  the  system  to  normal  form  by  using  either  of  the  following  two 
methods : 

1. Isoenergetic reduction: Fix an energy constant and consider a
neighborhood of the closed trajectory on the $(2n - 1)$-dimensional energy level
manifold as the extended phase space of a system with $n - 1$ degrees of
freedom, periodically depending on time.

2. Surface of section: Fix an energy constant and value of one of the
coordinates (so that the closed trajectory intersects the resulting $(2n - 2)$-
dimensional manifold transversally). Then phase curves near the given
one define a mapping of this $(2n - 2)$-dimensional manifold to itself,
with a fixed point on the closed trajectory. This mapping preserves the
natural structure on our $(2n - 2)$-dimensional manifold, and we can
study it by using the normal form in Section B.

In  investigating  closed  trajectories  of  autonomous  hamiltonian  systems, 
a  phenomenon  arises  which  contrasts  with  the  general  theory  of  equilibrium 
positions  of  systems  with  periodic  coefficients.  The  fact  is  that  the  closed 
trajectories  of  an  autonomous  system  are  not  isolated,  but  form  (as  a  rule) 
one-parameter  families.  The  parameter  of  the  family  is  the  value  of  the  energy 
constant. In fact, assume that for some choice of the energy constant the
closed trajectory intersects transversally the $(2n - 2)$-dimensional manifold
described above in the $(2n - 1)$-dimensional energy level manifold. Then
for nearby values of the energy, there will exist a similar closed trajectory.
By  the  implicit  function  theorem  we  can  even  say  that  this  closed  trajectory 
depends  smoothly  on  the  energy  constant. 

If  we  now  wish  to  use  the  Birkhoff  normal  form  to  investigate  a  one- 
parameter  family  of  closed  trajectories,  we  encounter  the  following  difficulty. 
As  the  parameter  describing  the  family  varies,  the  eigenvalues  of  the  linearized 
problem  will  generally  change.  Therefore,  for  some  values  of  the  parameter 
we  will  inevitably  encounter  resonances,  obstructing  reduction  to  the  normal 
form. 
Especially dangerous are resonances of low order, since they influence
the first few terms of the Taylor series. If we are interested in a closed trajectory
for which the eigenvalues nearly satisfy a resonance relation of low order,
then the Birkhoff form must be somewhat modified. Namely, for resonance
of order $N$ some of the expressions

$$k_0 + [\omega_1(\beta_1 - \alpha_1) + \cdots + \omega_n(\beta_n - \alpha_n)], \qquad |\alpha| + |\beta| = N,$$

by which we must divide to kill the terms of order $N$ in the hamiltonian
function, may become zero. For non-resonant values of the parameter which
are close to resonance, this combination of characteristic frequencies is
generally not zero, but very small (this combination is therefore called a
“small denominator”).

Division  by  a  small  denominator  leads  to  the  following  difficulties: 

1. The transformation which reduces to normal form depends
discontinuously on the parameter (it has poles for resonant values of the
parameter);

2. The region in which the Birkhoff normal form accurately describes the
system contracts to zero at resonance.
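
The first difficulty is easy to see in numbers. The sketch below is an illustration only: it takes a single frequency $\omega = \omega_0 + \varepsilon$ with the hypothetical resonant value $\omega_0 = 1/3$, and evaluates the denominator $\omega(\alpha - \beta) + k$ for the cubic term with $\alpha = 3$, $\beta = 0$, $k = -1$ (the overall sign convention does not affect where the denominator vanishes). The function name `cubic_denominator` is an assumption of this example.

```python
from fractions import Fraction

def cubic_denominator(eps, alpha=3, beta=0, k=-1):
    """Denominator omega*(alpha - beta) + k for omega = 1/3 + eps (hypothetical resonant value 1/3)."""
    omega = Fraction(1, 3) + eps
    return omega * (alpha - beta) + k

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    d = cubic_denominator(eps)
    # The coefficient of the normalizing transformation is proportional to 1/d.
    print(eps, d, 1 / d)
```

The denominator equals $3\varepsilon$, so the coefficient of the normalizing transformation grows like $1/(3\varepsilon)$ and has a pole at $\varepsilon = 0$.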

In order to get rid of these deficiencies, we must give up trying to annihilate
some of the terms of the hamiltonian (namely, those which become resonant
for resonance values of the parameter). Moreover, these terms must be
preserved not only for resonance, but also for nearby values of the
parameter.105 The normal form thus obtained is somewhat more complicated than
the usual normal form, but in many cases it gives us useful information on
the behavior of solutions near resonance.

D Example: Resonance of order 3

As a simple example, we will study what happens to a closed trajectory of an
autonomous hamiltonian system with two degrees of freedom, for which
the period of oscillation (about the closed trajectory) of neighboring
trajectories is three times the period of the closed trajectory itself. By what we said
above, this problem may be reduced to an investigation of a one-parameter
family of non-autonomous hamiltonian systems with one degree of freedom,
$2\pi$-periodically depending on time, in a neighborhood of an equilibrium
position. This equilibrium position can be taken as the origin for all values of
the parameter (to achieve this we must make a change of variables depending
on the parameter).

Furthermore, the linearized system at the equilibrium position can be
converted into a linear system with constant coefficients by a $2\pi$-periodically
time-dependent linear canonical change of variables. In the new coordinates
the phase flow of the linearized system is represented as a uniform rotation
around the equilibrium position. The angular velocity $\omega$ of this rotation
depends on the parameter.

105 The method indicated here is useful not only in investigating hamiltonian systems, but also
in the general theory of differential equations. Cf., for example, V. I. Arnold, “Lectures on
bifurcations and versal families,” Russian Math. Surveys 27, No. 5, 1972, 54-123.

At the resonance value of the parameter, $\omega = 1/3$ (i.e., after time $2\pi$, we have
gone one-third of the way around the origin). The derivative of the angular
velocity $\omega$ with respect to the parameter is generally not zero. Therefore, we
can take as a parameter this angular velocity or, even better, its difference
from $1/3$. We will denote this difference by $\varepsilon$. The quantity $\varepsilon$ is called the
frequency deviation or detuning. The resonance value of the parameter is
$\varepsilon = 0$, and we are interested in the behavior of the system for small $\varepsilon$.

If we disregard the nonlinear terms in Hamilton’s equations and
disregard the frequency deviation $\varepsilon$, then all trajectories of our system become
closed after making three revolutions (i.e., they have period $6\pi$). We now
want to study the influence of the nonlinear terms and frequency deviation
on the behavior of the trajectories. It is clear that in the general case not all the
trajectories will be closed. To study their behavior, it is useful to look at
the normal form.

In the chosen coordinate system, $z = p + iq$, $\bar{z} = p - iq$, the hamiltonian
function has the form

$$-2iH = -i\omega z\bar{z} + \sum_{\alpha + \beta = 3} \sum_{k = -\infty}^{+\infty} h_{\alpha\beta k}\, z^\alpha \bar{z}^\beta e^{ikt} + \cdots,$$

where the dots indicate terms of order higher than three, and where
$\omega = (1/3) + \varepsilon$.

In the reduction to normal form we can kill all terms of degree three
except those for which the small denominator

$$\omega(\alpha - \beta) + k$$

becomes zero at resonance. These terms can be described also as those
which are constant along trajectories of the periodic motion obtained by
disregarding the frequency deviation and nonlinearity. They are called the
resonant terms. Thus, for resonance $\omega = 1/3$ the resonant terms are those for
which

$$\alpha - \beta + 3k = 0.$$

Of the terms of third order, only $z^3 e^{-it}$ and $\bar{z}^3 e^{it}$ turn out to be resonant.
Thus we can reduce the hamiltonian function to the form

$$-2iH = -i\omega z\bar{z} + h z^3 e^{-it} - \bar{h}\bar{z}^3 e^{it} + \cdots$$

(the conjugacy of $h$ and $\bar{h}$ corresponds to the fact that $H$ is real).
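
The selection of the resonant terms is a finite enumeration, which can be sketched as follows. The function name `resonant_terms` and the search window for $k$ are assumptions of this example; the condition tested is the relation $\alpha - \beta + 3k = 0$ with $\alpha + \beta$ equal to the degree.

```python
def resonant_terms(degree=3, k_range=range(-5, 6)):
    """Triples (alpha, beta, k) with alpha + beta = degree and alpha - beta + 3k = 0."""
    return [(a, degree - a, k)
            for a in range(degree + 1)
            for k in k_range
            if a - (degree - a) + 3 * k == 0]

# Each triple stands for the monomial z^alpha * zbar^beta * exp(i k t).
print(resonant_terms(3))
```

For degree three this leaves exactly the triples $(0, 3, 1)$ and $(3, 0, -1)$, i.e. the monomials $\bar{z}^3 e^{it}$ and $z^3 e^{-it}$.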

Note that, in order to reduce the hamiltonian function to this normal
form, we made a $2\pi$-periodic time-dependent smooth canonical
transformation which depends smoothly on the parameter, even in the case of resonance.
This transformation differs from the identity only by terms that are small of
second order relative to the deviation from the closed trajectory (and its
generating function differs from the generating function of the identity only
by cubic terms).
Further investigation of the behavior of solutions of Hamilton’s equations
proceeds in the following way. First, we throw out of the hamiltonian function
all terms of order higher than three and study the solutions of the resulting
truncated system. Then we must see how the discarded terms can affect the
behavior of the trajectories.

The  study  of  the  truncated  system  can  be  simplified  by  introducing  a 
coordinate  system  in  the  complex  z-plane  which  rotates  uniformly  with 
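angular velocity $\tfrac13$, i.e., by the substitution $z = \zeta e^{it/3}$. Then for the variable $\zeta$
we obtain an autonomous hamiltonian system with hamiltonian function

$$-2iH_0 = -i\varepsilon\zeta\bar\zeta + h\zeta^3 - \bar h\bar\zeta^3, \qquad \text{where } \varepsilon = \omega - \tfrac13.$$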
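The passage to the rotating coordinate can be verified in a few lines. This is a sketch under one assumption not spelled out in the text: with $z = p + iq$, Hamilton's equations $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ take the complex form $\dot z = -\partial(-2iH)/\partial\bar z$.

```latex
\begin{align*}
\dot z &= -\frac{\partial(-2iH)}{\partial\bar z}
        = i\omega z + 3\bar h\,\bar z^{2}e^{it}
        && \text{(truncated normal form)}\\[2pt]
z = \zeta e^{it/3}:\qquad
\dot\zeta\,e^{it/3} + \tfrac{i}{3}\,\zeta e^{it/3}
       &= i\omega\,\zeta e^{it/3} + 3\bar h\,\bar\zeta^{2}e^{-2it/3}e^{it}\\[2pt]
\dot\zeta &= i\bigl(\omega - \tfrac13\bigr)\zeta + 3\bar h\,\bar\zeta^{2}
        = -\frac{\partial(-2iH_0)}{\partial\bar\zeta},
\qquad
-2iH_0 = -i\varepsilon\,\zeta\bar\zeta + h\zeta^{3} - \bar h\bar\zeta^{3}.
\end{align*}
```

The explicit time dependence cancels because $e^{-2it/3}e^{it} = e^{it/3}$, which is why the truncated system becomes autonomous.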

The  fact  that,  in  a  rotating  coordinate  system,  the  truncated  system  is  autonomous  is  very 
good  luck.  The  total  system  of  Hamilton’s  equations  (including  terms  of  degree  higher  than  three 
in  the  hamiltonian)  is  not  only  not  autonomous  in  a  rotating  coordinate  system,  but  is  not 
even $2\pi$-periodic (but only $6\pi$-periodic) in time. The autonomous system with hamiltonian $H_0$
is essentially the result of averaging the original system over closed trajectories of the linear
system with $\varepsilon = 0$ (where we disregard terms of degree higher than three).

The  coefficient  h  can  be  made  real  (by  a  rotation  of  the  coordinate  system). 
Thus  the  hamiltonian  function  in  the  real  coordinates  (x,  y)  is  reduced  to 
the  form 


$$H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + a(x^3 - 3xy^2).$$

The coefficient $a$ depends on the frequency deviation $\varepsilon$ as on a parameter.
For $\varepsilon = 0$ this coefficient is generally not zero. Therefore, we can make this
coefficient equal to 1 by a smooth change of coordinates depending on a
parameter. Thus we must investigate the dependence on the small parameter
$\varepsilon$ of the phase portrait of the system with hamiltonian function

$$H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + (x^3 - 3xy^2)$$

in  the  (x,  y)-plane. 

It  is  easy  to  see  that  this  dependence  consists  of  the  following  (Fig.  239). 


Figure  239  Passage  through  resonance  3:1 

For $\varepsilon = 0$ the zero level set of the function $H_0$ consists of three straight lines
through 0, intersecting at angles of 60°. Under a change of $\varepsilon$ the critical level line
(the one through the saddle points) still consists of three straight lines; these lines are
displaced as $\varepsilon$ changes, always forming an equilateral triangle with center at
the origin. The vertices of this triangle are saddle points of the hamiltonian
function. As $\varepsilon$ passes through zero (i.e., upon passage through resonance),
the  critical  point  at  the  origin  changes  from  a  minimum  to  a  maximum. 
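The picture just described can be checked numerically for the model hamiltonian $H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + (x^3 - 3xy^2)$. The sketch below is illustrative only: the function names are assumptions, and the saddle coordinates were found by hand from $\nabla H_0 = 0$. It confirms that the three nonzero critical points are saddles lying on the common level $\varepsilon^3/54$ at distance $|\varepsilon|/3$ from the origin, and that the line $x = \varepsilon/6$ lies entirely on that level.

```python
import math

def H0(x, y, eps):
    """Truncated hamiltonian (eps/2)(x^2 + y^2) + x^3 - 3 x y^2."""
    return 0.5 * eps * (x * x + y * y) + x**3 - 3.0 * x * y * y

def grad(x, y, eps):
    """Gradient (dH0/dx, dH0/dy)."""
    return (eps * x + 3.0 * x * x - 3.0 * y * y, eps * y - 6.0 * x * y)

def hessian_det(x, y, eps):
    """Determinant of the Hessian: H_xx = eps + 6x, H_yy = eps - 6x, H_xy = -6y."""
    return (eps + 6.0 * x) * (eps - 6.0 * x) - 36.0 * y * y

def saddles(eps):
    """The three nonzero solutions of grad H0 = 0 (computed by hand)."""
    s = eps / (2.0 * math.sqrt(3.0))
    return [(-eps / 3.0, 0.0), (eps / 6.0, s), (eps / 6.0, -s)]

for eps in (0.1, -0.1):
    level = eps**3 / 54.0
    for (x, y) in saddles(eps):
        gx, gy = grad(x, y, eps)
        assert abs(gx) < 1e-12 and abs(gy) < 1e-12       # critical point
        assert hessian_det(x, y, eps) < 0                # saddle
        assert abs(H0(x, y, eps) - level) < 1e-12        # common level
        assert abs(math.hypot(x, y) - abs(eps) / 3.0) < 1e-12
    # One side of the triangle: the whole line x = eps/6 lies on the level.
    for y in (-1.0, 0.3, 2.0):
        assert abs(H0(eps / 6.0, y, eps) - level) < 1e-12
    # At the origin the Hessian is eps times the identity:
    # a minimum for eps > 0, a maximum for eps < 0.
    assert hessian_det(0.0, 0.0, eps) > 0
```

Since the saddles sit at distance $|\varepsilon|/3$, the triangle shrinks to the origin linearly in $\varepsilon$.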

Thus,  for  a  system  with  hamiltonian  function  H0,  the  origin  is  a  stable 
equilibrium  position  for  all  values  of  the  parameter  except  at  resonance, 
and  at  resonance  the  origin  is  unstable.  For  values  of  the  parameter  close  to 
resonance,  the  triangle  close  to  the  origin  filled  by  closed  phase  curves  is 
small (of order $\varepsilon$), so the “radius of stability” of the origin approaches zero as
$\varepsilon \to 0$: a small (of order $\varepsilon$) perturbation of the initial condition is sufficient
to  make  a  phase  point  move  outside  the  triangle  and  begin  to  go  away  from 
the  equilibrium  position. 

Returning  to  the  original  problem  of  the  periodic  trajectory,  we  come  to 
the  following  conclusions  (which,  of  course,  are  not  proven,  since  we  threw 
out  terms  of  degree  higher  than  three,  but  which  can  be  justified): 

1. At the moment of passage through the resonance 3:1 a periodic trajectory
generally  loses  its  stability. 

2.  For  values  of  the  parameter  close  to  resonance  there  is  an  unstable  periodic 
trajectory  near  the  periodic  trajectory  under  consideration  on  the  same 
energy  level  manifold.  It  is  closed  after  making  three  circulations  along 
the original trajectory and one revolution around it. For the resonance
value of the parameter, this unstable trajectory merges with the original
one.

3.  The  distance  of  this  unstable  periodic  trajectory  from  the  original 
decreases,  as  we  approach  resonance,  to  first  order  in  the  frequency 
deviation  (i.e.,  as  the  first  order  of  the  difference  of  the  parameter  from  the 
resonance  value). 

4.  Through  this  unstable  trajectory  on  the  same  three-dimensional  energy 
level  manifold  there  pass  two  two-dimensional  invariant  surfaces, 
filled  with  trajectories  approximating  this  unstable  periodic  trajectory 
as $t \to +\infty$ on one surface and as $t \to -\infty$ on the other.

5. The location of the separatrices is such that, by intersecting with a manifold
transversal to the original trajectory, we obtain a figure close to the
three  sides  of  an  equilateral  triangle  and  their  continuations.  The  vertices 
of  the  triangle  are  the  points  of  intersection  of  the  unstable  periodic 
trajectory  with  the  transversal  manifold. 

6.  For  initial  conditions  inside  the  triangle  formed  by  the  separatrices,  a 
phase  point  stays  near  the  original  periodic  trajectory  (at  a  distance  of 
order $\varepsilon$) for a long time (of order not less than $1/\varepsilon$), and for initial conditions
outside the triangle it goes off quite rapidly to a distance which is large in
comparison with $\varepsilon$.

E  Splitting  of  separatrices

In  reality,  the  separatrices  we  talked  about  in  statements  4,  5,  and  6  above 
have  a  very  complicated  structure  (because  of  the  influence  of  the  terms 
of  order  higher  than  three  which  we  disregarded  in  our  approximation).  In 
order to understand the situation, it is convenient to look at a two-dimensional
surface transversally intersecting the original closed trajectory at
some point on it (and lying entirely in one energy level manifold).106 Trajectories
beginning on this surface intersect it again after a time close to the
time of circulation around the original closed trajectory. Thus we have a
mapping of a neighborhood of the point of intersection of the closed trajectory
with the surface onto a part of the surface. This mapping has a fixed
point  (at  the  point  where  the  closed  trajectory  intersects  the  surface)  and  is 
approximately  a  rotation  by  120°  around  this  point,  which  we  take  for  the 
origin  in  our  surface. 

We  now  consider  the  third  power  of  the  mapping  indicated  above.  This 
is again a mapping of some neighborhood of the origin to a part of the surface,
leaving the origin fixed. But now this mapping is approximately rotation
by  360°,  i.e.,  the  identity:  it  is  realized  by  the  trajectories  of  our  system  after 
approximately  three  periods  of  our  closed  trajectory. 

The  calculations  above  give  nontrivial  information  about  the  structure 
of  this  “mapping  after  three  periods.”  In  fact,  by  throwing  out  the  terms  of 
degree  four  and  higher  in  the  hamiltonian  function,  we  change  the  terms  of 
degree  three  and  higher  of  the  mapping.  Therefore,  the  mapping  after  three 
periods which corresponds to the truncated hamiltonian function approximates
(with cubic error) the actual mapping after three periods.

But we know the properties of the mapping after three periods corresponding
to the truncated hamiltonian function, since it is the mapping of the
phase flow of the system with hamiltonian function $H_0(x, y)$ after time
$6\pi$ (the proof is based on the fact that after time $6\pi$ our rotating coordinate
system returns to the original position). We now look at which of these
properties are preserved for perturbations of third-order smallness relative
to the distance from the fixed point, and which are not.

We  let  A0  denote  the  mapping  after  three  periods  for  the  truncated  system, 
and  A  the  actual  mapping  after  three  periods. 

1. The mapping $A_0$ is included in a flow: it is the transformation after time
$6\pi$ in the phase flow with hamiltonian $H_0$.

There is no reason to think that the mapping $A$ is included in a flow.

2. The mapping $A_0$ is symmetric under a rotation by 120°: there is a nontrivial
diffeomorphism $g$ for which $g^3 = E$ and which commutes with $A_0$.

There is no reason to think that the mapping $A$ commutes with any
nontrivial diffeomorphism $g$ satisfying $g^3 = E$.

106  Here  we  have  the  following  general  phenomenon:  it  is  easier  to  think  about  mappings  after 
a  period,  and  easier  to  calculate  with  flows. 
3. The mapping $A_0$ has three unstable fixed points at a distance of order $\varepsilon$ from the
origin, approximately the vertices of an equilateral triangle. For sufficiently
small deviations from resonance (i.e., for sufficiently small $\varepsilon$) the mapping
$A$ also has three unstable fixed points near the vertices of an equilateral
triangle. This follows from the implicit function theorem.

4.  The  separatrices  of  fixed  points  of  the  mapping  A0  form,  for  values  of  the 
parameter  close  to  (but  not  at)  resonance,  a  figure  approximating  the 
sides  and  extended  sides  of  an  equilateral  triangle.  If  we  begin  with  a 
point  on  one  of  the  sides  of  the  triangle,  then  after  repeated  applications 
of  A0  we  obtain  a  sequence  of  points  on  the  same  side  of  the  triangle 
approaching one of the vertices bounding the side, say $M_0$. Applying
$A_0^{-1}$, we obtain a sequence approaching the other vertex, which we will
denote by $N_0$.

Each of the three unstable fixed points of the mapping $A$ also has separatrices
approximating the sides of a triangle (Figure 240). Namely, those points
of the plane which approach the fixed point $M$ after applying the mappings
$A^n$, $n \to +\infty$, form a smooth curve $\Gamma^+$ invariant under $A$, passing through
$M$ and, near $M$, close to the side $M_0N_0$ of the separatrices of $A_0$. The points
which approach $N$ after applications of $A^n$, where $n \to -\infty$, form another
smooth invariant curve $\Gamma^-$, passing through $N$ and also near $M_0N_0$ near
$N_0$.


Figure 240 Splitting of separatrices


However the two curves $\Gamma^+$ and $\Gamma^-$, both near the line $M_0N_0$, are not at
all  obliged  to  coincide.  This  is  the  phenomenon  of  splitting  of  separatrices, 
which  accounts  for  the  differing  behavior  of  the  trajectories  of  the  truncated 
and  total  systems. 


The magnitude of the splitting of separatrices is exponentially small for small $\varepsilon$; therefore
it is easy to overlook the phenomenon of splitting in calculations in one or another scheme of
perturbation theory. However, this phenomenon is very important in fundamental questions.
For  example,  its  existence  immediately  implies  the  divergence  of  the  series  in  numerous  versions 
of  perturbation  theory  (since  if  the  series  converged,  there  would  be  no  splitting). 

In  general,  the  divergence  of  series  in  perturbation  theory  (while  a  good  approximation  is 
given  by  a  few  initial  terms)  is  usually  related  to  the  fact  that  we  are  looking  for  an  object  which 
does not exist. If we try to fit a phenomenon to a scheme which actually contradicts the essential
features  of  the  phenomenon,  then  it  is  not  surprising  that  our  series  diverge. 

The Birkhoff series (which are obtained if one continues infinitely the normalizations of
the initial terms of the Taylor series of the hamiltonian function) are one example of a formally
convergent, but actually divergent, scheme of perturbation theory. If these series converged,
then  a  general  oscillating  system  with  one  degree  of  freedom  with  periodic  coefficients  would  be 
reduced  near  an  equilibrium  position  to  an  autonomous  normal  form  and  there  would  be  no 
splitting  of  separatrices  in  it  (whereas  in  fact  there  is). 

Returning  to  the  original  closed  trajectory,  we  see  that  the  three  unstable 
fixed  points  of  the  mapping  A  correspond  to  an  unstable  closed  trajectory 
near the original triple. There is a family of trajectories approaching this
unstable trajectory as $t \to +\infty$, and another family of trajectories approaching
the unstable one as $t \to -\infty$. The points of the trajectories of
each of these families form a smooth surface containing our unstable trajectory.

These two surfaces are also the separatrices we talked about in statements
4, 5, and 6 of Section D. By intersecting them with our transversal
surface we obtain the invariant curves $\Gamma^+$ and $\Gamma^-$ of the mapping $A$. The
intersections  of  these  two  curves  form  a  complicated  network  about  which 
H. Poincaré, who first discovered the phenomenon of splitting of separatrices,
wrote,  “The  intersections  form  a  type  of  lattice,  tissue,  or  grid  with  infinitely 
fine  mesh.  Neither  of  the  two  curves  must  ever  cut  across  itself  again,  but 
it  must  bend  back  upon  itself  in  a  very  complex  manner  in  order  to  cut 
across  all  of  the  squares  in  the  grid  an  infinite  number  of  times. 

“One  will  be  struck  by  the  complexity  of  this  figure,  which  I  shall  not  even 
attempt  to  draw.  Nothing  is  more  suitable  for  providing  us  with  an  idea  of 
the  complex  nature  of  the  three-body  problem,  and  of  all  the  problems  of 
dynamics  in  general,  where  there  is  no  uniform  integral  and  where  the  Bohlin 
series are divergent.” (H. Poincaré, “Les Méthodes Nouvelles de la Mécanique
Céleste,” Vol. III, Dover, 1957, 389.)

We should note that much is still unclear about the picture of intersecting
separatrices.

F  Resonances  of  higher  order 

Resonances  of  higher  order  can  also  be  studied  using  a  normal  form.  In 
this  connection,  we  note  that  resonances  of  order  higher  than  4  do  not 
usually  induce  instability,  since  in  the  normal  form  terms  of  degree  4  appear, 
guaranteeing  a  minimum  or  maximum  of  the  function  H0  even  at  resonance. 

In the case of resonance of order $n > 4$, the typical development of the
phase portrait of the system with hamiltonian function $H_0$ is given by the
formula

$$H_0 = \varepsilon\tau + \tau^2\alpha(\tau) + a\tau^{n/2}\sin n\varphi,$$

$$2\tau = p^2 + q^2, \qquad \alpha(0) = \pm 1,$$

and  consists  of  the  following  (Figure  241). 

Figure 241 Averaged hamiltonian of phase oscillations near resonance 5:1

For small (of order $\varepsilon$) deviations of the frequency from resonance, and at
a small (of order $\sqrt{\varepsilon}$) distance from the equilibrium position at the origin,
the function $H_0$ has $2n$ critical points near the vertices of a regular $2n$-gon
with center at the origin. Half of these critical points are saddle points,
and the other half are maxima if the origin is a minimum or minima if the
origin is a maximum. The saddle points and stable points alternate. All $n$
saddle points lie on one level of the function $H_0$; their separatrices, connecting
successive saddle points, form $n$ “islands,” each of which is filled
with closed phase curves encircling a stable point. The width of the islands
is of order $\varepsilon^{(n/4)-(1/2)}$. The closed phase curves inside each island are called
“phase oscillations” (since what varies essentially is the phase of the oscillations
around the origin). The period of the phase oscillations grows with
decreasing frequency deviation $\varepsilon$ like $\varepsilon^{-n/4}$.

Inside  the  narrow  ring  formed  by  the  islands,  closer  to  the  origin,  there  are 
closed  phase  curves  encircling  the  origin;  outside  the  ring  the  phase  curves 
are  closed,  but  motion  along  them  proceeds  in  the  direction  opposite 
to that inside the ring. We note that the radius of the ring has order $\sqrt{\varepsilon}$
independently of the order of resonance, if this order is greater than 4. Also,
the ring of islands exists for only one of the two signs of $\varepsilon$.
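The orders of magnitude quoted for the ring of islands can be recovered by a short estimate. The following is a heuristic sketch, not a proof, under two assumptions introduced here: $\alpha(\tau)$ is replaced by the constant $\alpha_0 = \alpha(0)$, and $(\varphi, \tau)$ is treated as a canonically conjugate pair, so that near the ring $H_0$ is a pendulum hamiltonian.

```latex
\begin{align*}
&\text{radius of the ring:} &&
\frac{\partial}{\partial\tau}\bigl(\varepsilon\tau + \alpha_0\tau^{2}\bigr) = 0
 \;\Rightarrow\; \tau_0 = -\frac{\varepsilon}{2\alpha_0},\quad
 r_0 = \sqrt{2\tau_0} \sim \sqrt{|\varepsilon|}
 \quad (\text{requires } \varepsilon\alpha_0 < 0);\\
&\text{critical angles:} &&
\frac{\partial H_0}{\partial\varphi} = n a\,\tau^{n/2}\cos n\varphi = 0
 \;\Rightarrow\; 2n \text{ values of } \varphi \text{ in } [0, 2\pi);\\
&\text{near the ring } (\tau = \tau_0 + \rho): &&
H_0 \approx \text{const} + \alpha_0\rho^{2} + a\,\tau_0^{n/2}\sin n\varphi;\\
&\text{island width:} &&
|\alpha_0|\rho^{2} \sim |a|\,\tau_0^{n/2}
 \;\Rightarrow\; \rho \sim |\varepsilon|^{n/4},\quad
 \delta r = \rho/r_0 \sim |\varepsilon|^{(n/4)-(1/2)};\\
&\text{phase oscillations:} &&
\Omega^{2} = 2n^{2}\,|\alpha_0 a|\,\tau_0^{n/2} \sim |\varepsilon|^{n/2}
 \;\Rightarrow\; T = 2\pi/\Omega \sim |\varepsilon|^{-n/4}.
\end{align*}
```

The condition $\varepsilon\alpha_0 < 0$ in the first line is the reason the ring exists for only one sign of $\varepsilon$.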

If  we  pass  from  the  truncated  system  with  hamiltonian  H0  to  the  total 
system,  the  separatrices  split  in  a  way  similar  to  that  described  above  for 
resonance of order 3. The size of the splitting of the separatrices is exponentially
small (of order $e^{-1/\varepsilon}$), but the splitting is of fundamental importance
for investigating stability, especially in the multi-dimensional case.

Returning  to  our  original  closed  trajectory,  we  have  the  following  picture. 
As we approach resonance along the $\varepsilon$ axis from one side,107 two periodic
trajectories split off from our periodic trajectory: a stable one and an unstable
one. These new trajectories close up after $n$ circulations along the
original trajectory and lie at a distance of order $\sqrt{\varepsilon}$ from the original
trajectory. Near the stable trajectory there is a zone of slow phase oscillations
with period of order $\varepsilon^{-n/4}$ and amplitude of order $\pi/n$ in the azimuthal
direction and of order $\varepsilon^{(n/4)-(1/2)}$ in the radial direction. Loss of stability of the
original periodic trajectory at the moment of passage through resonance
does not occur, at least in the approximation which we have considered.

107 Unlike resonance of order 3, for which there is an unstable periodic trajectory branching
off from both sides of the resonance.


The  case  of  resonance  of  fourth  order  is  somewhat  exceptional.  In  this  case,  in  the  normal 
form  there  are  both  resonant  and  non-resonant  terms  of  order  4.  The  shape  of  the  phase  curves 
of  the  truncated  system  depends  on  which  of  these  terms  of  the  normal  form  dominates,  a 
resonant one or a non-resonant one. In the first case the development is the same as for third-order
resonance, except that in place of a triangle there is a square. In the second case the development
is the same as for $n > 4$.

In  conclusion,  we  remark  that  the  given  normal  form  becomes  a  better 
approximation as we get closer to resonance ($\varepsilon \ll 1$) and as the deviation
of  the  initial  point  from  the  periodic  trajectory  gets  smaller.  That  is,  as  the 
period  of  the  closed  trajectory  and  the  period  of  oscillation  of  neighboring 
trajectories  near  it  become  more  exactly  commensurable,  and  as  the  initial 
condition  approaches  the  closed  trajectory,  the  interval  of  time  grows  on 
which our approximation accurately describes the behavior of the phase
curves.

No  conclusion  about  the  behavior  of  non-closed  phase  curves  on  infinite 
intervals  of  time  (for  example,  about  the  Liapunov  stability  of  the  original 
periodic  trajectory)  follows  from  our  arguments,  since  the  terms  of  higher 
order  which  were  thrown  out  in  reducing  to  normal  form  can,  over  an  infinite 
period  of  time,  completely  change  the  character  of  the  motion.  Actually, 
under  the  conditions  considered,  the  original  periodic  trajectory  is  Liapunov 
stable,  but  the  proof  requires  substantially  new  techniques  beyond  the 
Birkhoff  normal  form  (cf.  Appendix  8). 

periodic motion, and Kolmogorov’s theorem

The  collection  of  solvable  “integrable”  problems  which  we  have  at  our 
disposal  is  not  large  (one-dimensional  problems,  motion  of  a  point  in  a 
central  field,  eulerian  and  lagrangian  motions  of  a  rigid  body,  the  problem  of 
two  fixed  centers,  and  motion  along  geodesics  on  the  ellipsoid).  However, 
with the help of these “integrable cases,” we can obtain meaningful information
about motions of many important systems by considering an integrable
problem  as  a  first  approximation. 

An  example  of  such  a  situation  is  the  problem  of  motion  of  the  planets 
around  the  sun  under  the  law  of  universal  gravitation.  The  mass  of  the  planets 
is  approximately  0.001  of  the  mass  of  the  sun,  so  in  a  first  approximation  we 
can  disregard  the  interaction  of  the  planets  on  one  another  and  consider 
only  the  attraction  by  the  sun.  As  a  result,  we  obtain  the  exactly  integrable 
problem of the motion of non-interacting planets around the sun; each planet
will  describe  its  keplerian  ellipse  independently  of  the  others,  and  the  motion 
of  the  system  as  a  whole  will  be  conditionally  periodic.  If  we  now  consider  the 
interactions  of  the  planets  on  one  another,  the  keplerian  motion  of  each 
planet  will  be  slightly  changed. 

We  call  upon  the  theory  of  perturbations  from  celestial  mechanics  to 
study  this  interaction.  It  is  clear  that  calculations  for  time  of  the  order  of 
1,000  years  do  not  present  any  fundamental  difficulties.  However,  if  we  want 
to study longer intervals of time, and especially if we are interested in qualitative
questions about the behavior of exact solutions of the equations of
motion on an infinite time interval, then such difficulties arise. The accumulation
of perturbations after an interval of time which is large in
comparison to 1,000 years could cause a complete change in the character of
the motion: for example, the planets could fall into the sun, escape from it, or
collide  with  one  another. 


Note  that  the  question  of  the  behavior  of  solutions  of  the  equations  of  motion  on  an  infinite 
time  interval  has  only  an  indirect  relation  to  the  problem  of  the  motion  of  real  planets.  The 
reason  is  that,  after  intervals  of  billions  of  years,  small  non-conservative  effects  not  considered 
in  Newton’s  equations  become  important.  Thus,  the  effects  of  the  gravitational  interaction  of 
the  planets  are  of  real  importance  only  when  they  seriously  change  the  picture  of  motion  within  a 
finite  time  which  is  small  in  comparison  with  the  time  of  development  of  non-conservative 
effects. 

In  calculating  motion  over  such  finite  times,  computers  prove  to  be  very  useful,  quickly 
determining the motion of the planets for many thousands of years in the future or past. However,
we should note that even the application of modern calculating methods may be insufficient
to predict the influence of perturbations if a phase point falls in the zone of exponential
instability.

Asymptotic  and  qualitative  methods  have  even  greater  value  for  the  study  of  charged 
particles  in  magnetic  fields,  since  in  this  situation  a  particle  outstrips  the  computer  and  makes 
so  many  orbits  that  mechanical  calculation  of  its  trajectory  is  impossible  even  in  the  absence  of 
exponential  instability. 


A  whole  series  of  methods  has  been  devised  for  calculating  perturbations 
in celestial mechanics. (A detailed analysis of them can be found in the book
“Les Méthodes Nouvelles de la Mécanique Céleste,” by H. Poincaré,
Dover, 1957.)

A  difficulty  with  all  of  these  methods  is  that  they  lead  to  divergent  series 
and  therefore  give  no  information  about  the  behavior  of  motion  as  a  whole 
over  infinite  intervals  of  time.  The  reason  for  the  divergence  of  series  in  the 
theory of perturbations is “small denominators”: integral linear combinations
of frequencies of unperturbed motions by which it is necessary to divide
in calculating the influence of perturbations. For exact resonance (i.e., for
commensurable frequencies) these denominators vanish, and the corresponding
term of the series in the theory of perturbations becomes infinitely
large. Close to resonance, this term of the series is very large.

Thus,  for  example,  in  their  motion  around  the  sun,  Jupiter  and  Saturn,  in  one  day,  go  through 
approximately 299 and 120.5 seconds of arc respectively. Therefore, the denominator $2\omega_J - 5\omega_S$
is  very  small  in  comparison  with  each  of  their  frequencies.  This  amounts  to  a  large  long-period 
perturbation  of  the  planets  on  one  another  (its  period  is  about  800  years);  the  study  by  Laplace 
of  this  effect  was  one  of  the  first  successes  of  the  theory  of  perturbations. 
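The figures in this example can be reproduced with a few lines of arithmetic. This is a minimal sketch: the variable names and the choice of a 365.25-day year are assumptions introduced here, while the two daily motions are the values quoted in the text.

```python
ARCSEC_PER_REV = 360 * 3600      # seconds of arc in one full revolution
DAYS_PER_YEAR = 365.25           # Julian year (an assumption of this sketch)

omega_jupiter = 299.0            # daily motion in seconds of arc, from the text
omega_saturn = 120.5

def period_years(omega):
    """Period in years of an angle that advances |omega| seconds of arc per day."""
    return ARCSEC_PER_REV / abs(omega) / DAYS_PER_YEAR

# The small denominator 2*omega_J - 5*omega_S, in seconds of arc per day.
denominator = 2 * omega_jupiter - 5 * omega_saturn

print("Jupiter period (years):", period_years(omega_jupiter))
print("Saturn period (years):", period_years(omega_saturn))
print("2*omega_J - 5*omega_S (arcsec/day):", denominator)
print("long period (years):", period_years(denominator))
```

The denominator comes out to -4.5 seconds of arc per day, roughly 1.5% of Jupiter's daily motion, and the corresponding period is a little under 800 years, in agreement with the figure quoted above.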

We  note  that  the  difficulty  caused  by  small  denominators  is  essential.  The 
rational numbers form a dense set; thus in the phase space of an unperturbed
problem,  initial  conditions  for  which  we  have  resonance  and  the  small 
denominators  vanish  form  a  dense  set.  Hence,  the  functions  given  by  the 
series  of  perturbation  theory  have  a  dense  set  of  singular  points. 
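The unavoidability of small denominators can be seen numerically even for two frequencies. The sketch below is illustrative only: the frequency pair $(1, \sqrt 2)$ and the function name are assumptions introduced here. It computes the smallest value of $|k_1\omega_1 + k_2\omega_2|$ over nonzero integer vectors with $|k_i| \le N$. For a non-resonant pair this minimum never vanishes but decreases toward zero as $N$ grows; for a resonant pair it is exactly zero.

```python
import math

def smallest_denominator(omega, n_max):
    """min |k1*omega[0] + k2*omega[1]| over integers (k1, k2) != (0, 0)
    with |k1|, |k2| <= n_max."""
    best = math.inf
    for k1 in range(-n_max, n_max + 1):
        for k2 in range(-n_max, n_max + 1):
            if (k1, k2) != (0, 0):
                best = min(best, abs(k1 * omega[0] + k2 * omega[1]))
    return best

omega = (1.0, math.sqrt(2.0))    # an illustrative non-resonant pair
for n_max in (1, 3, 10, 30, 100):
    print(n_max, smallest_denominator(omega, n_max))

# A resonant pair: 1*1.0 - 2*0.5 = 0, so the denominator vanishes exactly.
print(smallest_denominator((1.0, 0.5), 2))
```

The integer vectors attaining the minima for $(1, \sqrt 2)$ are the successive rational approximations of $\sqrt 2$, such as $3/2$, $7/5$, and $17/12$.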

The  difficulty  mentioned  here  is  characteristic  not  only  for  problems  of 
celestial  mechanics,  but  for  all  problems  which  are  close  to  integrable  (for 
instance, for the problem of an asymmetrical rigid top under very fast rotation).
Poincaré himself called the problem of studying perturbations of
conditionally-periodic motions in a system given by the hamiltonian

$$H = H_0(I) + \varepsilon H_1(I, \varphi), \qquad \varepsilon \ll 1,$$

in action-angle variables $I$ and $\varphi$, the fundamental problem of dynamics. Here
$H_0$ is the hamiltonian of the unperturbed problem, and $\varepsilon H_1$ is a perturbation
which is a $2\pi$-periodic function of the angle variables $\varphi_1, \ldots, \varphi_n$. In the unperturbed
problem ($\varepsilon = 0$) the angles $\varphi$ change uniformly with constant
frequencies

$$\omega = \frac{\partial H_0}{\partial I},$$

and all the action variables are first integrals.

We  must  investigate  the  phase  curves  of  Hamilton’s  equations 

$$\dot I = -\frac{\partial H}{\partial \varphi}, \qquad \dot\varphi = \frac{\partial H}{\partial I}$$

in a phase space which is a direct product of a region in $n$-dimensional space
with coordinates $I$ and the $n$-dimensional torus with angular coordinates $\varphi$.

A  substantial  advance  in  the  study  of  phase  curves  of  this  perturbed 
problem was begun in 1954 with the work of A. N. Kolmogorov in “On conservation
of conditionally-periodic motions for a small change in Hamilton’s
function,” Dokl. Akad. Nauk SSSR 98:4 (1954) 525-530 (Russian). In this
appendix  we  present  the  basic  results  obtained  since  then  in  this  area.  The 
proofs  can  be  found  in  the  following  works: 

V. I. Arnold, “Small denominators I. Mapping the circle onto itself,” Izv. Akad. Nauk SSSR
Ser. Mat. 25 (1961), 21-86.

V. I. Arnold, “Small denominators II. Proof of a theorem of A. N. Kolmogorov on the preservation
of conditionally-periodic motions under a small perturbation of the Hamiltonian,”
Russian Math. Surveys 18:5 (1963).

V. I. Arnold, “Small denominators III. Small denominators and problems of stability of motion
in classical and celestial mechanics,” Russian Math. Surveys 18:6 (1963).

V. I. Arnold, A. Avez, Ergodic problems of classical mechanics, New York, Benjamin, 1968.

J. Moser, On invariant curves of area-preserving mappings of an annulus, Nachr. Akad. Wiss.
Göttingen, Math. Phys. Kl. IIa (1962), 1-20.

J. Moser, A rapidly converging iteration method and nonlinear differential equations, Annali
della Scuola Norm. Sup. de Pisa (3) 20 (1966), 265-315; 499-535.

J.  Moser,  Convergent  series  expansions  for  quasi-periodic  motions,  Math.  Ann.  169  (1967), 
136-176. 

C.  L.  Siegel,  J.  K.  Moser,  Lectures  on  Celestial  Mechanics,  Springer-Verlag,  1971. 

S.  Sternberg,  Celestial  Mechanics,  I,  II,  New  York,  Benjamin,  1969. 

Before  formulating  our  results,  we  will  briefly  discuss  the  behavior  of 
phase  curves  in  the  unperturbed  problem  already  studied  in  Chapter  10. 

A  Unperturbed  motion 

The  system  with  hamiltonian  H0(I )  has  n  first  integrals  in  involution  (the  n 
action  variables).  Every  level  set  of  all  these  integrals  is  an  n-dimensional 
torus  in  2n-dimensional  phase  space.  This  torus  is  invariant  with  respect  to 
the  phase  flow  of  the  unperturbed  system:  every  phase  curve  starting  at  a 
point  of  our  torus  remains  on  it. 

The motion of a phase point on the invariant torus $I = \text{const}$ is conditionally-periodic.
The frequencies of this motion are the derivatives of the
unperturbed hamiltonian with respect to the action variables:

$$\dot\varphi_k = \omega_k(I), \qquad \text{where } \omega_k = \frac{\partial H_0}{\partial I_k}.$$

Therefore,  the  phase  curve  densely  fills  a  torus  whose  dimension  is  equal 
to the number of frequencies $\omega_k$ which are arithmetically independent.

We  note  that  the  frequencies  depend  on  which  torus  we  are  looking  at; 
i.e.,  which  values  of  the  first  integrals  we  have  fixed.  A  system  of  n  functions 
$\omega$ of $n$ variables $I$ is generally functionally independent; in such a case we
can simply number the tori by their frequencies, choosing the variables $\omega$
for coordinates in a neighborhood of the point under consideration in the
space of action variables $I$.

The case when the frequencies are functionally independent will be called
the nondegenerate case. The conditions for nondegeneracy have the form

$$\det\left|\frac{\partial\omega}{\partial I}\right| = \det\left|\frac{\partial^2 H_0}{\partial I^2}\right| \neq 0.$$


Thus,  in  the  nondegenerate  case,  the  unperturbed  problem  determines  on  the 
different  invariant  tori  in  phase  space  conditionally-periodic  motions  with 
different  frequencies.  In  particular,  the  invariant  tori  on  which  the  number  of 
frequencies  is  maximal  (i.e.,  n)  form  a  dense  set  in  phase  space;  such  tori  are 
called  non-resonant  tori. 
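The nondegeneracy condition is easy to test on simple examples with two degrees of freedom. The sketch below is illustrative only: the helper `hessian_det_2d` and the two sample hamiltonians are assumptions introduced here, and the second derivatives are approximated by central differences. It contrasts a nondegenerate system, where the frequencies vary from torus to torus, with a degenerate one, where they are the same on every torus.

```python
def hessian_det_2d(H0, I, step=1e-3):
    """Approximate det(d^2 H0 / dI^2) at I = (I1, I2) by central differences."""
    I1, I2 = I
    h = step
    H11 = (H0(I1 + h, I2) - 2.0 * H0(I1, I2) + H0(I1 - h, I2)) / h**2
    H22 = (H0(I1, I2 + h) - 2.0 * H0(I1, I2) + H0(I1, I2 - h)) / h**2
    H12 = (H0(I1 + h, I2 + h) - H0(I1 + h, I2 - h)
           - H0(I1 - h, I2 + h) + H0(I1 - h, I2 - h)) / (4.0 * h**2)
    return H11 * H22 - H12 * H12

def rotators(I1, I2):
    """Two free rotators: omega = (I1, I2), so the frequencies label the tori."""
    return 0.5 * (I1**2 + I2**2)

def oscillators(I1, I2):
    """Two harmonic oscillators: omega = (1.0, 1.5) on every torus."""
    return 1.0 * I1 + 1.5 * I2

print(hessian_det_2d(rotators, (1.0, 2.0)))      # close to 1: nondegenerate
print(hessian_det_2d(oscillators, (1.0, 2.0)))   # close to 0: degenerate
```

A system of harmonic oscillators is thus degenerate in the sense of the text: its frequencies cannot serve as coordinates on the space of action variables.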

It  can  be  shown  that  the  non-resonant  tori  form  a  set  of  full  measure, 
i.e.,  the  Lebesgue  measure  of  the  union  of  all  invariant  resonant  tori  of  the 
unperturbed  non-degenerate  system  is  equal  to  zero.  Nevertheless,  invariant 
resonant  tori  exist  and  are  mixed  in  with  the  non-resonant  tori  in  such  a  way 
that  they  too  form  a  dense  set.  Furthermore,  the  set  of  resonant  tori  with  any 
number  of  independent  frequencies  from  1  to  n  —  1  is  dense.  In  particular, 
the invariant tori on which all phase curves are closed (the number of
independent frequencies is 1) form a dense set. Nevertheless, we note that the
probability  of  landing  on  a  resonant  torus  by  a  random  choice  of  initial 
point  in  the  phase  space  of  the  unperturbed  system,  is  equal  to  zero  (since  the 
probability  of  landing  on  a  rational  number  by  a  random  choice  of  a  real 
number  is  zero).  Thus,  by  disregarding  sets  of  measure  zero,  we  can  say  that 
almost all invariant tori in a nondegenerate unperturbed system are
non-resonant and have a total set of n arithmetically independent frequencies.

On  a  non-resonant  torus,  the  trajectory  of  a  conditionally-periodic  motion 
is dense. Thus, for almost all initial conditions, a phase curve of a
non-degenerate unperturbed system densely fills an invariant torus whose dimension
is  equal  to  the  number  of  degrees  of  freedom  (i.e.,  half  the  dimension  of  the 
phase  space). 

To  better  understand  the  whole  picture,  we  consider  the  case  of  two 
degrees  of  freedom  (n  =  2).  In  this  case,  the  phase  space  is  four-dimensional 
so  each  energy  level  set  is  three-dimensional.  We  fix  one  such  level  set.  This 
three-dimensional manifold, fibered by two-dimensional tori, can be
represented in ordinary three-dimensional space as a family of concentric tori
lying inside one another (Figure 242).

The phase curves are windings of these tori; both frequencies of circulation
change from torus to torus. In general, not only both frequencies but also
their ratio will change from torus to torus. If the derivative of the ratio of
frequencies with respect to the action variable numbering the tori on the
given level set of the function $H_0$ is not zero, then we say that our system is
isoenergetically nondegenerate. The condition for isoenergetic nondegeneracy
has (as is easy to calculate) the form

$$\det\begin{pmatrix} \dfrac{\partial^2 H_0}{\partial I^2} & \dfrac{\partial H_0}{\partial I} \\[1ex] \dfrac{\partial H_0}{\partial I} & 0 \end{pmatrix} \neq 0.$$


The conditions for nondegeneracy and isoenergetic nondegeneracy are independent from
one another: i.e., a nondegenerate system could be isoenergetically degenerate, and an
isoenergetically nondegenerate system could be degenerate. In the many-dimensional case (n > 2)
isoenergetic nondegeneracy means nondegeneracy of the following mapping of the
(n - 1)-dimensional level manifold of the function $H_0$ of n action variables to the
projective space of dimension n - 1:

$$I \mapsto (\omega_1(I) : \omega_2(I) : \cdots : \omega_n(I)).$$
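
The independence of the two conditions can be checked on toy hamiltonians. The sketch below, for n = 2, evaluates the determinant of the Hessian and the bordered determinant from a gradient and Hessian written out by hand. The two hamiltonians (H0 = I1 + I2^2/2 and H0 = log(I1/I2)) and all function names are illustrative assumptions and are not taken from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def nondegeneracy_dets(grad, hess):
    """Return (det of Hessian, det of bordered Hessian) for n = 2.

    grad = (w1, w2) are the frequencies dH0/dI,
    hess is the 2x2 matrix d^2 H0 / dI^2.
    """
    bordered = [
        [hess[0][0], hess[0][1], grad[0]],
        [hess[1][0], hess[1][1], grad[1]],
        [grad[0], grad[1], 0.0],
    ]
    return det2(hess), det3(bordered)


def example_linear_plus_square(i1, i2):
    """Assumed example: H0 = I1 + I2**2 / 2.

    Degenerate (Hessian determinant 0), yet isoenergetically nondegenerate.
    """
    grad = (1.0, i2)
    hess = [[0.0, 0.0], [0.0, 1.0]]
    return nondegeneracy_dets(grad, hess)


def example_log_ratio(i1, i2):
    """Assumed example: H0 = log(I1 / I2) with I1, I2 > 0.

    Nondegenerate, yet isoenergetically degenerate: the frequency ratio
    w1 / w2 = -I2 / I1 is constant on each level set H0 = const.
    """
    grad = (1.0 / i1, -1.0 / i2)
    hess = [[-1.0 / i1 ** 2, 0.0], [0.0, 1.0 / i2 ** 2]]
    return nondegeneracy_dets(grad, hess)
```

Under these assumptions the first example should give a vanishing Hessian determinant with a nonvanishing bordered determinant, and the second the reverse (up to rounding).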

Now consider an isoenergetically nondegenerate system with two degrees
of freedom. It is easy to construct a two-dimensional plane in the
three-dimensional energy level set transversally intersecting the two-dimensional
tori of our family (in a family of concentric circles in the model in
three-dimensional euclidean space).

A  phase  curve  beginning  in  such  a  plane  returns  to  it  after  making  a 
circuit  around  the  torus.  As  a  result  we  obtain  a  new  point  on  the  same  circle 
in  which  the  torus  intersects  the  plane.  In  this  way  there  arises  a  mapping  of 
the  plane  to  itself. 

This  mapping  of  the  plane  to  itself  fixes  the  concentric  meridian  circles  in 
which  the  plane  intersects  the  invariant  tori.  Every  circle  is  rotated  through 
some  angle,  namely  through  that  fraction  of  an  entire  revolution  that  the 
frequency  along  the  meridian  constitutes  of  the  frequency  along  the  equator. 

If  the  system  is  isoenergetically  nondegenerate,  the  angle  of  revolution  of 
invariant  circles  in  the  plane  of  intersection  changes  from  one  circle  to 
another.  Therefore,  on  some  circles  this  angle  will  be  commensurable  with  a 
whole  revolution,  and  on  others  it  will  be  incommensurable.  Each  of  these 
classes  of  circles  will  form  a  dense  set,  but  on  almost  all  circles  (in  the  sense  of 
Lebesgue  measure)  the  angle  of  rotation  will  be  incommensurable  with  a 
whole  revolution. 

The commensurability or incommensurability is manifested in the following
way in the behavior of points of a circle under the mapping of the region
to itself. If the angle of rotation is commensurable with a whole rotation, then
after several iterations of the mapping the point will return to its initial
position (the number of iterations will be larger as the denominator of the
fraction expressing the angle of rotation is larger). If the angle of rotation is
incommensurable with a whole rotation, the successive images of the point
under repetitions of the mapping will densely fill up the meridian circle.
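
The two kinds of behavior just described can be tried out on a single circle. The following is a minimal sketch, assuming the circle is parametrized by a fraction of a whole turn in [0, 1); the function names and the choice of the golden-mean rotation as the incommensurable example are assumptions made only for illustration.

```python
from fractions import Fraction
import math


def return_time(rotation, max_iter=10_000):
    """Iterations needed for a point to return exactly to its start.

    `rotation` is an exact rational fraction of a whole turn.
    Returns None if the point has not returned within max_iter steps.
    """
    angle = Fraction(0)
    for n in range(1, max_iter + 1):
        angle = (angle + rotation) % 1
        if angle == 0:
            return n
    return None


def largest_gap(rotation, n_points):
    """Largest arc (fraction of the circle) not hit by the first n_points images."""
    pts = sorted((k * rotation) % 1.0 for k in range(n_points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # arc across the wrap-around point
    return max(gaps)


# Assumed incommensurable example: rotation by the golden mean (about 0.618 turn).
GOLDEN_ROTATION = (math.sqrt(5.0) - 1.0) / 2.0
```

For a rational rotation p/q in lowest terms, `return_time` gives q, in line with the remark about the denominator. For the golden-mean rotation, `largest_gap` shrinks as more images are taken, which is what dense filling looks like in a finite computation; a finite scan of course only suggests density and does not prove it.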

We  note  further  that  commensurability  corresponds  to  resonant  tori  and 
incommensurability  to  non-resonant  tori.  Also,  the  existence  of  resonant 
tori  implies  the  following  property.  Consider  some  power  of  the  mapping  of 
our  region  to  itself  induced  by  the  phase  curves.  Let  the  exponent  be  the 
denominator  of  the  fraction  expressing  the  ratio  of  the  frequencies  on  one  of 
the  resonant  tori.  Then  the  mapping  raised  to  the  indicated  power  has  a 
whole  circle  consisting  entirely  of  fixed  points  (namely,  the  meridian  of  the 
resonant  torus  under  consideration). 

Such  behavior  of  fixed  points  is  unnatural  for  mappings  in  any  sort  of 
general  form,  even  canonical  mappings  (fixed  points  are  usually  isolated). 
In the given case, a whole circle of fixed points arises because we have
considered an unperturbed integrable system. For arbitrarily small perturbations
of  general  form,  this  property  of  the  mapping  (having  a  whole  circle  of  fixed 
points)  must  fail.  The  circle  of  fixed  points  must  be  dispersed  so  that  only  a 
finite  number  remain. 

In  other  words,  under  small  perturbations  of  our  integrable  system,  we 
expect  a  change  in  the  qualitative  picture  of  the  phase  curves,  if  only  in  the 
respect that entire invariant tori filled out by closed phase curves will
disintegrate so that there remain only a finite number of closed curves, near
those  for  the  unperturbed  system,  and  the  remaining  phase  curves  will  be 
more  complicated.  We  have  already  encountered  such  a  case  in  Appendix  7 
in  investigating  phase  oscillations  near  resonance. 

We  now  consider  what  happens  to  non-resonant  invariant  tori  under  a 
small  perturbation  of  a  hamiltonian  function.  Formal  application  of  the 
principle  of  averaging  (i.e.,  the  first  approximation  of  the  classical  theory  of 
perturbations,  cf.  Section  52)  leads  us  to  the  conclusion  that  a  non-resonant 
torus  does  not  undergo  any  evolution. 


We note that the fact that the perturbations are hamiltonian is essential, since for
non-conservative perturbations it is clear that the action variables may evolve. In celestial mechanics,
their  evolution  means  a  secular  change  in  the  major  semi-axes  of  the  keplerian  ellipses,  i.e.,  the 
planets  falling  into  the  sun,  colliding,  or  escaping  to  a  large  distance  in  a  time  which  is  inversely 
proportional  to  the  size  of  the  perturbation.  If  conservative  perturbations  led  to  evolutions  in 
a  first  approximation,  this  would  manifest  itself  in  the  fate  of  the  planets  after  a  time  on  the 
order  of  1,000  years.  Fortunately,  the  order  of  magnitude  of  the  non-conservative  perturbations 
is  much  less. 


The theorem of Kolmogorov, formulated below, furnishes one justification
for the conclusion, drawn from the non-rigorous theory of perturbations,
about the absence of evolution of action variables.

B Invariant tori in a perturbed system

Theorem.  If  an  unperturbed  system  is  nondegenerate,  then  for  sufficiently 
small  conservative  hamiltonian  perturbations,  most  non-resonant  invariant 
tori  do  not  vanish,  but  are  only  slightly  deformed,  so  that  in  the  phase  space 
of  the  perturbed  system,  too,  there  are  invariant  tori  densely  filled  with  phase 
curves winding around them conditionally-periodically, with a number of
independent  frequencies  equal  to  the  number  of  degrees  of  freedom. 

These  invariant  tori  form  a  majority  in  the  sense  that  the  measure  of  the 
complement  of  their  union  is  small  when  the  perturbation  is  small. 

A.  N.  Kolmogorov’s  proof  of  this  theorem  is  based  on  the  following  two 
observations. 

1. We fix a non-resonant set of frequencies of the unperturbed system so
that the frequencies are not only independent, but do not even approximately
satisfy any resonance conditions of low order. More precisely, we fix a set
of frequencies $\omega$ for which there exist C and $\nu$ such that
$|(\omega, k)| > C|k|^{-\nu}$ for all integral vectors $k \neq 0$.

It can be shown that, if $\nu$ is sufficiently large (say $\nu = n + 1$), then the
measure of the set of such vectors $\omega$ (lying in a fixed bounded region) for
which the indicated condition of non-resonance is violated, is small when C
is small.
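
The non-resonance condition of the first observation can be probed numerically. The sketch below scans integer vectors k of bounded size and reports the smallest value of |(ω, k)| |k|^ν, taking |k| to mean the sum of the absolute values of the components (an assumption about the norm). The function name, the cut-off `max_norm`, and the golden-mean frequency vector are all hypothetical choices for illustration, and a finite scan can only suggest, not establish, that a constant C exists.

```python
import itertools
import math


def smallest_scaled_divisor(omega, nu, max_norm):
    """Minimum of |(omega, k)| * |k|**nu over integer k with 0 < |k| <= max_norm.

    |k| is taken as |k_1| + ... + |k_n|. A strictly positive result means the
    inequality |(omega, k)| >= C |k|**(-nu) holds on the scanned range with
    C equal to the returned value.
    """
    n = len(omega)
    best = math.inf
    for k in itertools.product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(ki) for ki in k)
        if norm == 0 or norm > max_norm:
            continue
        divisor = abs(sum(w * ki for w, ki in zip(omega, k)))
        best = min(best, divisor * norm ** nu)
    return best


# Assumed test vectors: a golden-mean pair (far from resonance) and a resonant pair.
GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0
NONRESONANT = (1.0, GOLDEN)
RESONANT = (1.0, 1.5)  # satisfies 3 * 1 - 2 * 1.5 = 0
```

With these inputs, `smallest_scaled_divisor(NONRESONANT, 1, 30)` stays bounded away from zero, while `smallest_scaled_divisor(RESONANT, 1, 30)` is exactly zero because of the relation k = (3, -2).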

Next,  near  a  non-resonant  torus  of  the  unperturbed  system  corresponding 
to  a  fixed  value  of  the  frequencies,  we  will  look  for  an  invariant  torus  of  the 
perturbed  system  on  which  there  is  conditionally-periodic  motion  with 
exactly  the  same  frequencies  as  the  ones  we  fixed,  and  which  necessarily 
satisfy  the  condition  of  being  non-resonant  described  above. 

In this way, instead of the variations of frequency customary in
perturbation schemes (consisting of the introduction of frequencies depending on the
perturbation), we must hold constant the non-resonant frequencies, while
selecting initial conditions depending on the perturbation in order to
guarantee motion with the given frequencies. This can be done by a small
(when the perturbation is small) change of initial conditions, because the
frequencies change with the action variables according to the
non-degeneracy condition.

2.  The  second  observation  is  that,  to  find  an  invariant  torus,  instead  of 
using  the  usual  series  expansion  in  powers  of  the  perturbation  parameter,  we 
can  use  a  rapidly  convergent  method  similar  to  Newton’s  method  of  tangents. 

Newton’s method of tangents for finding roots of algebraic equations with
initial error $\varepsilon$ gives, after n approximations, an error of order
$\varepsilon^{2^n}$. Such super-convergence allows us to paralyze the influence of
the small denominators appearing in every approximation, and in the end succeeds
not only in carrying out an infinite number of approximations, but also in
showing the convergence of the entire procedure.

The assumption under which all this can be done is that the unperturbed
hamiltonian function $H_0(I)$ is analytic and nondegenerate, and the perturbing
hamiltonian function $\varepsilon H_1(I, \varphi)$ is analytic and $2\pi$-periodic in
the angle variables $\varphi$. The presence of the small parameter $\varepsilon$ is
immaterial: it is important only that the perturbation be sufficiently small in
some complex neighborhood of radius $\rho$ of the real plane of the variables
$\varphi$ (less than some positive function $M(\rho, H_0)$).

As  J.  Moser  showed,  the  requirement  of  analyticity  can  be  changed  to 
differentiability  of  sufficiently  high  order  if  we  combine  Newton’s  method 
with  an  idea  of  J.  Nash,  the  application  of  a  smoothing  operator  at  each 
approximation. 

The resulting conditionally-periodic motions of the perturbed system with
fixed frequencies $\omega$ turn out to be smooth functions of the parameter
$\varepsilon$ of perturbation. Therefore, they could have been sought, without
Newton’s method, in the form of a series in powers of $\varepsilon$. The
coefficients of this series, called the Lindstedt series, can actually be found;
however, we can prove its convergence only indirectly, with the help of
newtonian approximations.
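
The super-convergence of Newton’s method invoked in the second observation is easy to see on a scalar equation. The sketch below is only a toy illustration of the error roughly squaring at each step; it is not Kolmogorov’s iteration scheme, and the test equation x^2 - 2 = 0, the starting point, and the function name are assumptions.

```python
def newton_errors(f, df, x0, root, steps):
    """Errors |x_n - root| for Newton's method x_{n+1} = x_n - f(x_n) / df(x_n)."""
    errors = [abs(x0 - root)]
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - root))
    return errors


# Assumed test problem: the positive root of x**2 - 2, starting from 1.5.
ERRORS = newton_errors(
    lambda x: x * x - 2.0,   # f
    lambda x: 2.0 * x,       # f'
    1.5,
    2.0 ** 0.5,
    4,
)
```

Each entry of `ERRORS` is no larger than the square of the previous one until rounding error takes over, so three steps already bring an initial error of about 0.09 down to the order of 1e-12. A geometrically convergent scheme would need dozens of steps for the same gain, which is why the quadratic scheme can absorb the loss caused by small denominators at every step.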

C  Zones  of  instability 

The  presence  of  invariant  tori  in  the  phase  space  of  the  perturbed  problem 
means  that,  for  most  initial  conditions  in  a  system  which  is  nearly  integrable, 
motion  remains  conditionally  periodic  with  a  maximal  set  of  frequencies. 

The  question  naturally  arises  of  what  happens  to  the  remaining  phase 
curves,  with  initial  conditions  falling  into  the  gaps  between  the  invariant  tori 
which  replace  the  resonant  invariant  tori  of  the  non-perturbed  problem. 

The disintegration of a resonant torus on which the number of frequencies
is one less than the maximum is easy to investigate in a first-order
perturbation theory. To do this, we must average the perturbation over the
(n - 1)-dimensional invariant tori into which the resonant invariant torus is
decomposed and which are densely filled out by phase curves of the
unperturbed system. After averaging, we obtain a conservative system with one
degree of freedom (cf. the investigation of phase oscillations near resonance
in Appendix 7), which is easy to study.

In the approximation under consideration we have, near the n-dimensional
reducible torus, stable and unstable (n - 1)-dimensional tori, with phase
oscillations around the stable ones. The corresponding conditionally-periodic
motions have a full set of n frequencies, of which n - 1 are the fast
frequencies of the original oscillations and one is the slow (of order
$\sqrt{\varepsilon}$) frequency of the phase oscillations.

However,  one  must  not  conclude  that  the  only  difference  between  motions 
in  the  unperturbed  and  perturbed  systems  is  the  appearance  of  “islands” 
of phase oscillations. In fact, the actual phenomena are much more
complicated than the first approximation described above. One manifestation of
this complicated behavior of the phase curves of the perturbed problem is
the splitting of separatrices discussed in Appendix 7.

To study motions of a perturbed system outside of the invariant tori we
must  distinguish  the  cases  of  two  and  higher  degrees  of  freedom.  For  two 
degrees  of  freedom,  the  dimension  of  the  phase  space  is  four,  and  an  energy 
level  manifold  is  three-dimensional.  Therefore,  the  invariant  two-dimensional 
tori  divide  each  energy  level  set.  Thus,  a  phase  curve  beginning  in  the  gap 
between  two  invariant  tori  of  the  perturbed  system  remains  forever  confined 
between  those  tori.  No  matter  how  complicated  this  curve  appears,  it  does 
not  leave  its  gap,  and  the  corresponding  action  variables  remain  forever  near 
their  initial  conditions. 

If the number n of degrees of freedom is greater than two, the
n-dimensional invariant tori do not divide the (2n - 1)-dimensional energy level
manifold but are arranged in it like points on a plane or lines in space. In this
case  the  “gaps”  corresponding  to  different  resonances  are  connected  to  one 
another,  so  the  invariant  tori  do  not  prevent  phase  curves  starting  near 
resonance  from  going  far  away.  Hence,  there  is  no  reason  to  expect  that  the 
action  variables  along  such  a  phase  curve  will  remain  close  to  their  initial 
values  for  all  time. 

In other words, under sufficiently small perturbations of systems with
two degrees of freedom (satisfying the generally fulfilled condition of
isoenergetic nondegeneracy), not only do the action variables along a phase
trajectory have no secular perturbations in any approximation of
perturbation theory (i.e., they change little in a time interval on the order of
$(1/\varepsilon)^N$ for any N, where $\varepsilon$ is the magnitude of the
perturbation), but these variables remain forever near their initial values.
This is true, both for non-resonant phase curves conditionally-periodically
filling out two-dimensional tori (and comprising most of the phase space), and
for the remaining initial conditions.

At the same time, there exist systems with more than two degrees of
freedom satisfying all the nondegeneracy conditions, in which, although for
most initial conditions motion is conditionally periodic, for some initial
conditions a slow drift of the action variables away from their initial values
occurs. The average velocity of this drift in known examples¹⁰⁸ is on the
order of $e^{-1/\sqrt{\varepsilon}}$, i.e., this velocity decreases faster than any
power of the perturbation parameter. Thus it is not surprising that this
drifting away does not appear in any approximation of perturbation theory.
(By average velocity, we mean the ratio of the increase of action variables to
time, so that we are actually dealing with an increase of order 1 after a time
of order $e^{1/\sqrt{\varepsilon}}$.)

An  upper  bound  on  the  average  velocity  of  the  drift  of  the  action  variables 
in  general  nearly  integrable  systems  of  hamiltonian  equations  with  n  degrees 
of freedom is included in the recent work of N. N. Nehoroshev.¹⁰⁹


108  Cf.  V.  I.  Arnold,  Instability  of  dynamical  systems  with  many  degrees  of  freedom.  Soviet 
Mathematics  5:3  (1964)  581-585. 

109  N.  N.  Nehoroshev,  The  behavior  of  hamiltonian  systems  that  are  close  to  integrable  ones. 
Functional Analysis and Its Applications, 5:4 (1971); Uspekhi Mat. Nauk 32:6 (1977).

This bound, like the lower bound mentioned above, has the form
$e^{-1/\varepsilon^d}$; thus the increase of the action variables is small while
the time is small in comparison with $e^{1/\varepsilon^d}$, if
$\varepsilon < \varepsilon_0$. Here $\varepsilon$ is the magnitude of the
perturbation, and d is a number between 0 and 1 defined, like $\varepsilon_0$, by
the properties of the unperturbed hamiltonian $H_0$. In addition, a nondegeneracy
condition is imposed on the unperturbed hamiltonian (this condition has a long
formulation, but is generally satisfied; in particular, strong convexity of the
unperturbed hamiltonian is sufficient, i.e., positive or negative definiteness of
the second differential of $H_0$).

From this upper bound it is clear that secular changes of the action
variables are not detected by any approximation of perturbation theory, since
the average velocity of these changes is exponentially small. We note also
that secular changes of the action variables obviously have no directional
character, but are represented by more or less random wandering in the
resonant regions between the invariant tori. A more detailed discussion of
the questions arising here can be found in the article, “Stochastic instability
of nonlinear oscillations,” by G. M. Zaslavski and B. V. Chirikov, Soviet
Physics Uspekhi, v. 105, no. 1 (1971), 3-39.


D  Variants  of  the  theorem  on  invariant  tori 

Statements  analogous  to  the  theorem  on  conservation  of  invariant  tori  in  an 
autonomous  system  have  been  proved  for  non-autonomous  equations  with 
periodic  coefficients  and  for  symplectic  mappings.  Analogous  statements  are 
valid  in  the  theory  of  small  oscillations  in  a  neighborhood  of  an  equilibrium 
position  of  an  autonomous  system  or  a  system  with  periodic  coefficients,  as 
well  as  in  a  neighborhood  of  a  closed  phase  curve  of  a  phase  flow  or  in  a 
neighborhood  of  a  fixed  point  of  a  symplectic  mapping. 

The  nondegeneracy  conditions  necessary  in  the  various  cases  are  different. 
For  reference,  we  will  now  give  these  nondegeneracy  conditions.  We  will 
limit  ourselves  to  the  simplest  requirements  of  nondegeneracy,  which  are  all 
fulfilled  by  systems  in  “general  position.”  In  many  cases,  the  requirements 
of  nondegeneracy  can  be  weakened,  but  the  advantage  gained  by  this  is  offset 
by  the  complication  of  the  formulas. 

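1. Autonomous systems. The hamiltonian function is

$$H = H_0(I) + \varepsilon H_1(I, \varphi), \qquad I \in G \subset \mathbb{R}^n, \quad \varphi \bmod 2\pi \in T^n.$$

The nondegeneracy condition

$$\det\left(\frac{\partial^2 H_0}{\partial I^2}\right) \neq 0$$

guarantees preservation¹¹⁰ of most invariant tori under small perturbations
($\varepsilon \ll 1$).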


110 It is understood that the tori are slightly deformed under perturbations.

The condition for isoenergetic nondegeneracy

$$\det\begin{pmatrix} \dfrac{\partial^2 H_0}{\partial I^2} & \dfrac{\partial H_0}{\partial I} \\[1ex] \dfrac{\partial H_0}{\partial I} & 0 \end{pmatrix} \neq 0$$


guarantees  the  existence  on  every  energy  level  manifold  of  a  set  of  invariant 
tori  whose  complement  has  small  measure.  The  frequencies  on  these  tori 
generally depend on the size of the perturbation, but the ratios of frequencies
are preserved under changes in $\varepsilon$.

If  n  =  2,  then  the  condition  for  isoenergetic  nondegeneracy  also  guarantees 
stability  of  the  action  variables,  in  the  sense  that  they  remain  forever  close  to 
their  initial  values  for  sufficiently  small  perturbations. 

2. Periodic systems. The hamiltonian function is

$$H = H_0(I) + \varepsilon H_1(I, \varphi, t), \qquad I \in G \subset \mathbb{R}^n, \quad \varphi \bmod 2\pi \in T^n;$$

the perturbation is $2\pi$-periodic not only in $\varphi$, but also in t. It is
natural to look at the unperturbed system in the (2n + 1)-dimensional space
$\{(I, \varphi, t)\} = \mathbb{R}^n \times T^{n+1}$. The invariant tori have
dimension n + 1. The nondegeneracy condition

$$\det\left(\frac{\partial^2 H_0}{\partial I^2}\right) \neq 0$$

guarantees the preservation of most (n + 1)-dimensional invariant tori under
a small perturbation ($\varepsilon \ll 1$).

If n = 1, this nondegeneracy condition also guarantees stability of the
action  variable,  in  the  sense  that  it  remains  forever  near  its  initial  value  for 
sufficiently  small  perturbations. 

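3. Mappings $(I, \varphi) \to (I', \varphi')$ of the “2n-dimensional annulus.” The
generating function is

$$S(I', \varphi) = S_0(I') + \varepsilon S_1(I', \varphi), \qquad I' \in G \subset \mathbb{R}^n, \quad \varphi \in T^n.$$

The nondegeneracy condition

$$\det\left(\frac{\partial^2 S_0}{\partial I^2}\right) \neq 0$$

guarantees the preservation of most invariant tori of the unperturbed
mapping $(I, \varphi) \to (I, \varphi + \partial S_0/\partial I)$ under small
perturbations ($\varepsilon \ll 1$).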

If  n  =  1,  we  obtain  an  area-preserving  mapping  of  the  ordinary  annulus  to 
itself. The unperturbed mapping is represented on each circle I = const as a
rotation.  In  this  case  the  nondegeneracy  condition  means  that  the  angle  of 
rotation  changes  from  one  circle  to  another. 

The  invariant  tori  in  the  case  n  =  1  are  ordinary  circles.  In  this  case,  the 
theorem guarantees that under iterations of the mapping all the images of a
point will remain near the circle on which the original point lay, if the
perturbation is sufficiently small.
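
The confinement asserted for n = 1 can be observed on a concrete area-preserving map of the annulus. The sketch below uses the map I' = I + ε sin φ, φ' = φ + I' (commonly called the standard map); it is an assumed example that does not appear in the text. Its unperturbed rotation angle on the circle I = const equals I, so the angle changes from one circle to another. The function names, the initial point, and the iteration count are likewise assumptions.

```python
import math


def twist_map_orbit(action, angle, eps, n_iter):
    """Actions along an orbit of the area-preserving map

        I'   = I + eps * sin(phi)
        phi' = phi + I'   (mod 2*pi)

    For eps = 0 every circle I = const is rotated rigidly through the angle I.
    """
    actions = [action]
    for _ in range(n_iter):
        action = action + eps * math.sin(angle)
        angle = (angle + action) % (2.0 * math.pi)
        actions.append(action)
    return actions


def max_action_drift(action, angle, eps, n_iter):
    """Largest deviation of the action from its initial value along the orbit."""
    orbit = twist_map_orbit(action, angle, eps, n_iter)
    return max(abs(a - orbit[0]) for a in orbit)


# Assumed initial circle: rotation angle equal to a golden-mean fraction of 2*pi,
# chosen to stay away from low-order resonances.
I0 = 2.0 * math.pi * (math.sqrt(5.0) - 1.0) / 2.0
```

For a small perturbation such as ε = 0.01, `max_action_drift(I0, 0.0, 0.01, 20000)` remains a small fraction of the distance to neighboring low-order resonant circles. A finite run like this is consistent with the theorem but is of course no substitute for it.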

4. Neighborhoods of equilibrium positions (autonomous case). An
equilibrium position is assumed to be stable in a linear approximation so that n
characteristic frequencies $\omega_1, \ldots, \omega_n$ are defined. We assume that
there are no resonance relations among the characteristic frequencies, i.e., no
relations

$$k_1\omega_1 + \cdots + k_n\omega_n = 0 \quad \text{with integers } k_i \text{ such that } 0 < \sum |k_i| \le 4.$$

Then the hamiltonian function can be reduced to the Birkhoff normal form
(cf. Appendix 7)

$$H = H_0(\tau) + \cdots,$$

where $H_0(\tau) = \sum \omega_k \tau_k + \frac{1}{2} \sum \omega_{kl} \tau_k \tau_l$
and the dots denote terms of degree higher than four with respect to the distance
from the equilibrium position. The nondegeneracy condition

$$\det|\omega_{kl}| \neq 0$$

guarantees the existence of a set of invariant tori of almost full measure in a
sufficiently small neighborhood of the equilibrium position.

The condition for isoenergetic nondegeneracy,

$$\det\begin{pmatrix} \omega_{kl} & \omega_k \\ \omega_l & 0 \end{pmatrix} \neq 0,$$


guarantees  the  existence  of  such  a  set  of  invariant  tori  on  every  energy  level 
set  (sufficiently  close  to  the  critical  point). 

In  the  case  n  =  2,  the  condition  for  isoenergetic  nondegeneracy  is  satisfied 
if  the  quadratic  part  of  the  function  H0  is  not  divisible  by  the  linear  part.  In 
this  case,  isoenergetic  nondegeneracy  guarantees  Liapunov  stability  of  the 
equilibrium  position. 
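
The absence of resonance relations assumed in item 4 is a finite check, since only integer vectors with 0 < Σ|k_i| ≤ 4 are involved. A minimal sketch follows; the function name and the numerical tolerance `tol` are assumptions, and with floating-point frequencies a relation can only be detected up to that tolerance.

```python
import itertools


def low_order_resonances(omega, max_order=4, tol=1e-9):
    """Integer vectors k with 0 < |k_1| + ... + |k_n| <= max_order
    and |k_1*omega_1 + ... + k_n*omega_n| <= tol."""
    found = []
    n = len(omega)
    for k in itertools.product(range(-max_order, max_order + 1), repeat=n):
        order = sum(abs(ki) for ki in k)
        if 0 < order <= max_order:
            combination = sum(ki * wi for ki, wi in zip(k, omega))
            if abs(combination) <= tol:
                found.append(k)
    return found
```

For example, the frequencies (1, 2) satisfy the third-order relation 2ω_1 - ω_2 = 0 and are reported, while (1, √2) gives an empty list, so the Birkhoff normal form of item 4 would be available in the second case only.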

5. Neighborhoods of equilibrium positions (periodic case). Here again we
assume stability in a linear approximation, so that n characteristic
frequencies $\omega_1, \ldots, \omega_n$ are defined. We assume that there are no
resonance relations

$$k_1\omega_1 + \cdots + k_n\omega_n + k_0 = 0 \quad \text{with} \quad 0 < \sum_{i=1}^{n} |k_i| \le 4$$

among the characteristic frequencies and the frequency of the
time-dependence of the coefficients (which we will assume equal to 1).

Then the hamiltonian function can be reduced to a Birkhoff normal form
in the same way as in the autonomous case, but with $2\pi$-periodicity with
respect to time in the remainder term.

The nondegeneracy condition

$$\det|\omega_{kl}| \neq 0$$

guarantees the existence of (n + 1)-dimensional invariant tori in the
(2n + 1)-dimensional extended phase space, near the circle $\tau = 0$ representing
the equilibrium position.

In the case n = 1 the nondegeneracy condition reduces to the
non-vanishing of the derivative of the period of small oscillations with respect to the
square  of  the  amplitude  of  small  oscillations.  In  this  case,  nondegeneracy 
guarantees  that  the  equilibrium  position  is  Liapunov  stable. 

6. Fixed points of mappings. Here we assume that all 2n eigenvalues of the
linearization of a canonical mapping at a fixed point have modulus 1 and do
not satisfy any low-order resonance relations of the form:

$$\lambda_1^{k_1} \cdots \lambda_n^{k_n} = 1, \qquad |k_1| + \cdots + |k_n| \le 4$$

(where the 2n eigenvalues are $\lambda_1, \ldots, \lambda_n, \bar{\lambda}_1, \ldots, \bar{\lambda}_n$).

Then if we disregard terms of higher than third order in the Taylor series
at the fixed point, the mapping can be written in Birkhoff normal form

$$(\tau, \varphi) \to (\tau, \varphi + \alpha(\tau)), \quad \text{where} \quad \alpha(\tau) = \frac{\partial S}{\partial \tau}, \quad S = \sum \omega_k \tau_k + \frac{1}{2} \sum \omega_{kl} \tau_k \tau_l$$

(the usual coordinates in a neighborhood of the equilibrium position are
$p_k = \sqrt{2\tau_k} \cos\varphi_k$, $q_k = \sqrt{2\tau_k} \sin\varphi_k$).

The nondegeneracy condition

$$\det|\omega_{kl}| \neq 0$$

guarantees the existence of n-dimensional invariant tori (close to the tori
$\tau$ = const), forming a set of almost full measure in a sufficiently small
neighborhood of the equilibrium position.

If n = 1, we have a mapping of the ordinary plane to itself, and the
invariant  tori  become  circles.  The  nondegeneracy  condition  means  that,  for 
the  normal  form,  the  derivative  of  the  angle  of  rotation  of  a  circle  with  respect 
to  the  area  bounded  by  the  circle  is  not  zero  (at  the  fixed  point  and,  therefore, 
in  some  neighborhood  of  it). 

In the case n = 1 the nondegeneracy condition guarantees Liapunov
stability of the fixed point of the mapping. We note that in this case the
condition of absence of lower resonance has the form

$$\lambda^3 \neq 1, \qquad \lambda^4 \neq 1.$$

Thus a fixed point of an area-preserving mapping of the plane to itself is
Liapunov stable if the linear part of the mapping is rotation through an angle
which is not a multiple of 90° or 120° and if the coefficient $\omega_{11}$ in the
normal Birkhoff form is not zero (guaranteeing nontrivial dependence of the angle
of rotation on the radius).
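
The criterion for n = 1 can be written as a small check. In the sketch below the rotation angle of the linear part is given in degrees and the normal-form coefficient is passed in as a number; the function names, the tolerance, and the parameter name `omega_11` are assumptions. The criterion is only a sufficient condition, so a `False` result means that the criterion does not apply, not that the fixed point is unstable.

```python
import cmath
import math


def has_low_order_resonance(angle_deg, tol=1e-9):
    """True if lam = exp(i * angle) satisfies lam**3 = 1 or lam**4 = 1 (within tol),
    i.e. if the angle is a multiple of 120 or of 90 degrees."""
    lam = cmath.exp(1j * math.radians(angle_deg))
    return abs(lam ** 3 - 1) < tol or abs(lam ** 4 - 1) < tol


def stable_by_criterion(angle_deg, omega_11, tol=1e-9):
    """Sufficient condition for Liapunov stability in the case n = 1:
    no resonance of order 3 or 4 and a nonzero normal-form coefficient."""
    return (not has_low_order_resonance(angle_deg, tol)) and abs(omega_11) > tol
```

For instance, a rotation through 100° with a nonzero coefficient passes the check, whereas rotations through 90°, 120°, or 180° do not, whatever the coefficient.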

We  have  not  gone  into  the  smoothness  conditions  assumed  in  these 
theorems. The minimal smoothness needed is not known in even one case.
For example, we point out that the last assertion about stability of fixed
points of a mapping of the plane to itself was first proved by J. Moser under
the assumption of 333-times differentiability, and only later (by Moser and
Rüssmann) was the number of derivatives reduced to 6.

E Applications of the theorem on invariant tori and its generalizations

There  are  many  mechanical  problems  to  which  we  can  apply  the  theorem 
formulated  above.  One  of  the  simplest  of  these  problems  is  the  motion  of  a 
pendulum  under  the  action  of  a  periodically  changing  exterior  field  or  under 
the  action  of  vertical  oscillations  of  the  point  of  suspension. 

It  is  well  known  that,  in  the  absence  of  parametric  resonance,  the  lower 
equilibrium  position  of  a  pendulum  is  stable  in  the  linear  approximation.  The 
stability  of  this  position  with  regard  to  nonlinear  effects  (under  the  further 
assumption  of  the  absence  of  resonances  of  order  3  and  4)  can  be  proved 
only  with  the  help  of  the  theorem  on  invariant  tori. 

In  an  analogous  way  we  can  use  the  theorem  on  invariant  tori  to  investigate 
conditionally-periodic motions of a system of interacting nonlinear
oscillators.

Another  example  is  the  geodesic  flow  on  a  convex  surface  close  to  an 
ellipsoid.  There  are  two  degrees  of  freedom  in  this  system,  and  we  can  show 
that most geodesics on a near-ellipsoidal surface in three-dimensional space oscillate
between  two  “caustics”  close  to  the  lines  of  curvature  of  the  surface,  densely 
filling  out  the  ring  between  them.  At  the  same  time,  we  can  arrive  at  theorems 
on  the  stability  of  the  two  closed  geodesics  obtained,  after  deforming  the 
surface,  from  the  two  ellipses  containing  the  middle  axis  of  the  ellipsoid  (in 
the  absence  of  resonances  of  orders  3  and  4). 

As  one  more  example,  we  can  look  at  closed  trajectories  on  a  billiard  table 
of  any  convex  shape.  Among  the  closed  billiard  trajectories  are  those  which 
are  stable  in  the  linear  approximation,  and  we  can  conclude  that  in  the 
general case they are actually stable. An example of such a stable billiard
trajectory is the minor axis of an ellipse; therefore, a closed billiard
trajectory close to the minor axis of an ellipse, on a billiard table which is
almost an ellipse, is stable.

Application  of  the  theorem  on  invariant  tori  to  the  problem  of  rotations 
of  an  asymmetric  heavy  rigid  body  allows  us  to  consider  the  nonintegrable 
case of a rapidly rotating body. The problem of rapid rotation is
mathematically equivalent to the problem of motion with moderate velocity in a
weak gravitational field: the essential parameter is the ratio of potential to
kinetic  energy.  If  this  parameter  is  small,  then  we  can  use  eulerian  motion  of 
a  rigid  body  as  a  first  approximation. 

By  applying  the  theorem  on  invariant  tori  to  the  problem  with  two  degrees 
of  freedom  obtained  after  eliminating  cyclic  coordinates  (rotations  around 
the  vertical)  we  come  to  the  following  conclusion  about  the  motion  of  a 
rapidly rotating body: if the kinetic energy of rotation of a body is sufficiently
large in comparison with the potential energy, then the length of the vector of
angular momentum and its angle with the horizontal remain forever close
to their initial values.

It  follows  from  this  that  the  motion  of  the  body  will  forever  be  close  to  a 
combination of Euler-Poinsot motion and azimuthal precession, except in
the  case  when  the  initial  values  of  kinetic  energy  and  total  momentum  are 
close  to  those  for  which  the  body  can  rotate  around  the  middle  principal  axis. 
In  this  last  case,  realized  only  for  special  initial  conditions,  the  splitting  of 
separatrices  near  the  middle  axis  implies  a  more  complicated  undulation 
about  the  middle  axis  than  in  Euler-Poinsot  motion. 

One  generalization  of  the  theorem  on  invariant  tori  leads  to  the  theorem 
on the adiabatic invariance for all time of the action variable in a
one-dimensional oscillating system with periodically changing parameters. Here
we  must  assume  that  the  rule  for  changing  parameters  is  given  by  a  fixed 
smooth  periodic  function  of  “slow  time,”  and  the  small  parameter  of  the 
problem  is  the  ratio  of  the  period  of  characteristic  oscillations  and  the  period 
of change of parameters. Then, if the period of change of parameters is
sufficiently large, the change in the adiabatic invariant of a phase point remains
small  in  the  course  of  an  infinite  interval  of  time. 

In  an  analogous  way  we  can  prove  the  adiabatic  invariance  for  all  time 
of  the  action  variable  in  the  problem  of  a  charged  particle  in  an  axially- 
symmetric magnetic field. Violation of axial symmetry in this problem
increases the number of degrees of freedom from two to three, so that the
invariant  tori  cease  to  divide  the  energy  level  manifolds,  and  the  phase  curve 
wanders  about  the  resonance  zones. 

Finally,  applying  the  theory  to  the  three-  (or  many-)  body  problem,  we 
succeed  in  finding  conditionally-periodic  motions  of  “planetary  type.”  To 
describe these motions, we must say a few words about the next
approximation after the keplerian one in the problem of the motion of the planets. For
simplicity  we  will  limit  ourselves  to  the  planar  problem. 

For  each  keplerian  ellipse,  consider  the  vector  connecting  the  focus  of  the 
ellipse  (i.e.,  the  sun)  to  the  center  of  the  ellipse.  This  vector,  called  the  Laplace 
vector,  characterizes  both  the  magnitude  of  the  eccentricity  of  the  orbit  and  the 
direction  to  the  perihelion. 

The interaction of the planets with one another causes the keplerian
ellipse  (and  therefore  the  Laplace  vector)  to  change  slowly.  In  addition,  there 
is  an  important  difference  between  changes  in  the  major  semi-axis  and 
changes  in  the  Laplace  vector.  Namely,  the  major  semi-axis  has  no  secular 
perturbations,  i.e.,  in  the  first  approximation  it  merely  oscillates  slightly 
around  its  average  value  (“Laplace’s  theorem”).  The  Laplace  vector,  on  the 
other  hand,  performs  both  periodic  oscillations  and  secular  motion.  The 
secular  motion  may  be  obtained  if  we  spread  each  planet  over  its  orbit 
proportionally  to  the  time  spent  in  travelling  each  piece  of  the  orbit,  and 
replace  the  attraction  of  the  planets  by  the  attraction  of  the  rings  obtained, 
that  is,  if  we  average  the  perturbation  over  the  rapid  motions.  The  true 
motion of the Laplace vector is obtained from the secular one by the addition
of small oscillations; these oscillations are essential if we are interested
in  small  intervals  of  time  (years),  but  their  effect  remains  small  in  comparison 
to  the  effect  of  the  secular  motion  if  we  consider  a  large  interval  of  time 
(thousands  of  years). 

Calculations  (carried  out  by  Lagrange)  show  that  the  secular  motion  of 
the  Laplace  vector  of  each  of  n  planets  moving  in  one  plane  consists  of  the 
following  (if  we  ignore  the  squares  of  the  eccentricities  of  the  orbits  which 
are  small  in  comparison  with  the  eccentricities  themselves).  In  the  orbital 
plane  of  a  planet  we  must  arrange  n  vectors  of  fixed  lengths,  each  rotating 
uniformly  with  its  angular  velocity.  The  Laplace  vector  is  their  sum. 
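
Lagrange's description can be sketched numerically by writing each rotating plane vector as a complex number. The amplitudes, frequencies, and phases below are hypothetical placeholders rather than data for any real planetary system, and the name `laplace_vector` is introduced only for this illustration.

```python
import cmath
import math

def laplace_vector(amplitudes, frequencies, phases, t):
    """Sum of uniformly rotating plane vectors, encoded as a complex number."""
    return sum(a * cmath.exp(1j * (w * t + p))
               for a, w, p in zip(amplitudes, frequencies, phases))

# Hypothetical values for a system of two planets (illustration only).
amps = [0.03, 0.01]                                     # fixed lengths
freqs = [2 * math.pi / 50_000, 2 * math.pi / 300_000]   # radians per year
phases = [0.0, 1.0]                                     # initial directions

for t in (0, 25_000, 50_000):
    z = laplace_vector(amps, freqs, phases, t)
    # abs(z) is the length of the Laplace vector, phase(z) its direction.
    print(t, abs(z), math.degrees(cmath.phase(z)))
```

With two components the length of the sum oscillates between the difference and the sum of the two fixed lengths, which is the slow variation of eccentricity discussed in the text.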

This  description  of  the  motion  of  the  Laplace  vector  is  obtained  because 
the  hamiltonian  system  averaged  with  respect  to  rapid  motions,  which 
describes the secular motion of the Laplace vector, has an equilibrium
position corresponding to zero eccentricities. The described motion of the
Laplace vector is the decomposition of small oscillations near this equilibrium
position into characteristic oscillations. The angular velocities of the
uniformly rotating components of the Laplace vector are the characteristic
frequencies,  and  the  lengths  of  these  components  determine  the  amplitudes 
of  the  characteristic  oscillations. 

We  note  that  the  motion  of  the  Laplace  vector  of  the  earth  is,  apparently,  one  of  the  factors 
involved  in  the  occurrence  of  ice  ages.  The  reason  is  that,  when  the  eccentricity  of  the  earth’s 
orbit  increases,  the  time  it  spends  near  the  sun  decreases,  while  the  time  it  spends  far  from  the 
sun  increases  (by  the  law  of  areas);  thus  the  climate  becomes  more  severe  as  the  eccentricity 
increases.  The  magnitude  of  this  effect  is  such  that,  for  example,  the  amount  of  solar  energy 
received  in  a  year  at  the  latitude  of  Leningrad  (60°N)  may  attain  the  value  which  now  corresponds 
to  the  latitudes  of  Kiev  (50°N)  (for  decreased  eccentricity)  and  Taimir  (80°N)  (for  increased 
eccentricity).  The  characteristic  time  of  variation  of  the  eccentricity  (tens  of  thousands  of  years) 
agrees  well  with  the  interval  between  ice  ages. 

The  theorems  on  invariant  tori  lead  to  the  conclusion  that  for  planets  of 
sufficiently  small  mass,  there  is,  in  the  phase  space  of  the  problem,  a  set  of 
positive  measure  filled  with  conditionally-periodic  phase  curves  such  that 
the  corresponding  motion  of  the  planets  is  nearly  motion  over  slowly 
changing  ellipses  of  small  eccentricities,  and  the  motion  of  the  Laplace 
vectors is almost that given by the approximation described above.
Furthermore, if the masses of the planets are sufficiently small, then motions of this
type fill up most of the region of phase space corresponding in the keplerian
approximation to motions of the planets in the same direction over
nonintersecting ellipses of small eccentricities.

The  number  of  degrees  of  freedom  in  the  planar  problem  with  n  planets 
is equal to 2n if we take the sun to be fixed. The integral of angular momentum
allows  us  to  eliminate  one  cyclic  coordinate;  however,  there  are  still  too 
many  variables  for  the  invariant  tori  to  divide  an  energy  level  manifold  (even 
if  there  are  only  two  planets  this  manifold  is  five-dimensional,  and  the  tori 
are three-dimensional). Therefore, in this problem we cannot draw any
conclusions about the preservation of the large semi-axes over an infinite interval
of time for all initial conditions, but only for most initial conditions.

A  problem  with  two  degrees  of  freedom  is  obtained  by  further  idealization. 
We  replace  one  of  the  two  planets  by  an  “asteroid”  which  moves  in  the  field 
of  the  second  planet  (“Jupiter”),  not  perturbing  its  motion. 

The  problem  of  the  motion  of  such  an  asteroid  is  called  the  restricted 
three-body  problem.  The  planar  restricted  three-body  problem  reduces  to  a 
system  with  two  degrees  of  freedom,  periodically  depending  on  time,  for  the 
motion  of  the  asteroid.  If,  in  addition,  the  orbit  of  Jupiter  is  circular,  then  in  a 
coordinate  system  rotating  together  with  it  we  obtain,  for  the  motion  of  the 
asteroid,  an  autonomous  hamiltonian  system  with  two  degrees  of  freedom — 
called  the  planar  restricted  circular  three-body  problem. 
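
As a minimal sketch of why the rotating frame gives an autonomous system, the code below writes down the rotating-frame hamiltonian in one common normalization and checks numerically that it is conserved. The normalization is an assumption not stated in the text: total mass 1, sun-Jupiter distance 1, angular velocity 1, sun at (-mu, 0) and Jupiter at (1 - mu, 0). The function names and the initial state are hypothetical.

```python
import math

def hamiltonian(x, y, px, py, mu):
    """Rotating-frame hamiltonian of the planar circular restricted problem."""
    r1 = math.hypot(x + mu, y)        # distance to the sun at (-mu, 0)
    r2 = math.hypot(x - 1 + mu, y)    # distance to Jupiter at (1 - mu, 0)
    return (0.5 * (px * px + py * py) + y * px - x * py
            - (1 - mu) / r1 - mu / r2)

def rhs(state, mu):
    """Hamilton's equations for the hamiltonian above."""
    x, y, px, py = state
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1 + mu, y)
    fx = -(1 - mu) * (x + mu) / r1**3 - mu * (x - 1 + mu) / r2**3
    fy = -(1 - mu) * y / r1**3 - mu * y / r2**3
    return (px + y, py - x, py + fx, -px + fy)

def shifted(state, k, h):
    return tuple(s + h * v for s, v in zip(state, k))

def rk4_step(state, mu, dt):
    """One classical Runge-Kutta step (a generic integrator, not symplectic)."""
    k1 = rhs(state, mu)
    k2 = rhs(shifted(state, k1, dt / 2), mu)
    k3 = rhs(shifted(state, k2, dt / 2), mu)
    k4 = rhs(shifted(state, k3, dt), mu)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

mu = 0.001                                   # hypothetical mass ratio
state = (0.5, 0.0, 0.0, math.sqrt(2.0))      # nearly circular inner orbit
h0 = hamiltonian(*state, mu)
for _ in range(1000):
    state = rk4_step(state, mu, 0.01)
drift = abs(hamiltonian(*state, mu) - h0)
print(drift)
```

The drift printed at the end measures only the integration error; the hamiltonian itself has no explicit time dependence in these coordinates.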

In  this  problem,  there  is  a  small  parameter — the  ratio  of  the  masses  of 
Jupiter and the sun. The zero value of the parameter corresponds to
unperturbed keplerian motion of the asteroid, represented in our
four-dimensional phase space as a conditionally-periodic motion on a two-dimensional
torus  (since  the  coordinate  system  is  rotating).  One  of  the  frequencies  of  this 
conditionally-periodic  motion  is  equal  to  1  for  all  initial  conditions;  this  is 
the  angular  velocity  of  the  rotating  coordinate  system,  i.e.,  the  frequency  of 
the  revolution  of  Jupiter  around  the  sun.  The  second  frequency  depends  on 
the  initial  conditions  (this  is  the  frequency  of  the  revolution  of  the  asteroid 
around  the  sun)  and  is  fixed  on  any  fixed  three-dimensional  level  manifold 
of  the  hamiltonian  function. 

Therefore,  the  nondegeneracy  condition  is  not  fulfilled  in  our  problem,  but 
the  condition  for  isoenergetic  nondegeneracy  is  fulfilled.  Kolmogorov’s 
theorem  applies,  and  we  conclude  that  most  invariant  tori  with  irrational 
ratios  of  frequencies  are  preserved  in  the  case  when  the  mass  of  the  perturbing 
planet  (Jupiter)  is  not  zero,  but  sufficiently  small. 

Furthermore,  the  two-dimensional  invariant  tori  divide  the  three- 
dimensional  level  manifolds  of  the  hamiltonian  function.  Therefore,  the 
magnitude  of  the  major  semi-axis  and  the  eccentricity  of  the  keplerian 
ellipse  of  the  asteroid  will  remain  forever  near  their  initial  values  if,  at  the 
initial  moment,  the  keplerian  ellipse  does  not  intersect  the  orbit  of  the 
perturbing  planet,  and  if  the  mass  of  this  planet  is  sufficiently  small. 

In  addition,  in  a  stationary  coordinate  system,  the  keplerian  ellipse  of  the 
asteroid  could  slowly  rotate,  since  our  system  is  only  isoenergetically  non¬ 
degenerate.  Therefore  under  perturbations  of  an  invariant  torus  frequencies 
are  not  preserved,  but  only  their  ratios.  As  a  result  of  a  perturbation,  the 
frequency  of  azimuthal  motion  of  the  perihelion  of  the  asteroid  in  a  stationary 
coordinate  system  could  be  slightly  different  from  Jupiter’s  frequency,  and 
then  in  the  stationary  system  the  perihelion  would  slowly  rotate. 

In his study of periodic solutions of problems in celestial mechanics, H.
Poincaré constructed a very simple model which contains the basic difficulties
of the problem. This model is an area-preserving mapping of the planar
circular annulus to itself. Mappings of this form arise in the study of dynamical
systems with two degrees of freedom. In fact, a mapping of a two-dimensional
surface of section to itself is defined as follows: each point p of
the  surface  of  section  is  taken  to  the  next  point  at  which  the  phase  curve 
originating  at  p  intersects  the  surface  (cf.  Appendix  7).  Thus,  a  closed  phase 
curve  corresponds  to  a  fixed  point  of  the  mapping  or  of  a  power  of  the 
mapping.  Conversely,  every  fixed  point  of  the  mapping  or  of  a  power  of 
the  mapping  determines  a  closed  phase  curve. 

In this way, a question about the existence of periodic solutions of
problems in dynamics is reduced to a question about fixed points of
area-preserving mappings of the annulus to itself. In studying such mappings,
Poincaré arrived at the following theorem.

A  Fixed  points  of  mappings  of  the  annulus  to  itself 

Theorem.  Suppose  that  we  are  given  an  area-preserving  homeomorphic  mapping 
of  the  planar  circular  annulus  to  itself.  Assume  that  the  boundary  circles  of 
the  annulus  are  turned  in  different  directions  under  the  mapping.  Then  this 
mapping  has  at  least  two  fixed  points. 


The  condition  that  the  boundary  circles  are  turned  in  different  directions 
means that, if we choose coordinates (x, y mod 2π) on the annulus so that the
boundary circles are x = a and x = b, then the mapping is defined by the
formula

(x, y) → (f(x, y), y + g(x, y)),

where the functions f and g are continuous and 2π-periodic in y, with
f(a, y) = a, f(b, y) = b, and g(a, y) < 0, g(b, y) > 0 for all y.

The proof of this theorem, announced by Poincaré not long before his
death, was given only later by G. D. Birkhoff (cf. his book, Dynamical
Systems, Amer. Math. Soc., 1927).

There remain many open questions related to this theorem; in particular,
attempts to generalize it to higher dimensions are important for the study
of periodic solutions of problems with many degrees of freedom. The
argument Poincaré used to arrive at his theorem applies to a whole series of other
problems. However, the intricate proof given by Birkhoff does not lend itself
to generalization. Therefore, it is not known whether the conclusions
suggested by Poincaré’s argument are true beyond the limits of the theorem on
the two-dimensional annulus. The argument in question is the following.

B The connection between fixed points of a mapping and critical points of the generating function

We will define a symplectic diffeomorphism of the annulus

(x, y) → (X, Y)

with the help of the generating function Xy + S(X, y), where the function S
is 2π-periodic in y. For this to be a diffeomorphism we need that ∂X/∂x ≠ 0.
Then

dS = (x − X) dy + (Y − y) dX,

and, therefore, the fixed points of the diffeomorphism are critical points of
the function F(x, y) = S(X(x, y), y). This function F can always be constructed
by defining it as the integral of the form (x − X) dy + (Y − y) dX. The
gradient  of  this  function  is  directed  either  inside  the  annulus  or  outside  on 
both  boundary  circles  at  once  (by  the  condition  on  rotation  in  different 
directions). 

But every smooth function on the annulus whose gradient on both
boundary circles is directed inside the annulus (or out from it) has a critical point
(maximum or minimum) inside the annulus. Furthermore, it can be shown
that the number of critical points of such a function on the annulus is at least
two. Therefore, we could assert that our diffeomorphism has at least two
fixed points if we were sure that every critical point of F is a fixed point of
the mapping.

Unfortunately, this is true only under the condition that ∂X/∂x ≠ 0, so
that we can express F in terms of X and y. Thus our argument is valid
for mappings which are not too different from the identity. For example, it is
sufficient that the derivatives of the generating function S be less than 1.
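
The correspondence between critical points of S and fixed points of the mapping can be tried out on a concrete example. The sketch below assumes a hypothetical generating function S(X, y) = X²/2 + ε (X − a)²(X − b)² cos y on the annulus a ≤ x ≤ b with a = −1, b = 1; this choice is mine, made so that the boundary circles are preserved and turn in opposite directions. The map is obtained from x = X + ∂S/∂y, Y = y + ∂S/∂X, which is the relation dS = (x − X) dy + (Y − y) dX.

```python
import math

A, B = -1.0, 1.0   # boundary circles x = A and x = B (illustrative choice)
EPS = 0.05         # size of the perturbation (illustrative choice)

def w(X):
    # Bump that vanishes to second order on both boundary circles.
    return (X - A) ** 2 * (X - B) ** 2

def dw(X):
    return 2 * (X - A) * (X - B) ** 2 + 2 * (X - A) ** 2 * (X - B)

def S_X(X, y):
    # Partial derivative of S(X, y) = X^2/2 + EPS*w(X)*cos(y) in X.
    return X + EPS * dw(X) * math.cos(y)

def S_y(X, y):
    # Partial derivative of S in y.
    return -EPS * w(X) * math.sin(y)

def annulus_map(x, y):
    """Map (x, y) -> (X, Y) defined by x = X + S_y(X, y), Y = y + S_X(X, y)."""
    lo, hi = A, B
    for _ in range(80):            # bisection; X + S_y is increasing in X here
        mid = 0.5 * (lo + hi)
        if mid + S_y(mid, y) < x:
            lo = mid
        else:
            hi = mid
    X = 0.5 * (lo + hi)
    return X, y + S_X(X, y)

# Critical points of S: S_y = 0 gives y = 0 or pi, S_X = 0 gives X = 0.
fixed_points = [(0.0, 0.0), (0.0, math.pi)]
for x, y in fixed_points:
    print((x, y), annulus_map(x, y))
```

For this small ε the condition ∂X/∂x ≠ 0 holds everywhere, so the two critical points of S are the fixed points of the map, in agreement with the count of at least two.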

A refinement of this argument (with a different choice of generating
function¹¹¹) shows that it is even sufficient that the eigenvalues of the Jacobi
matrix D(X, Y)/D(x, y) never be equal to −1 at any point, i.e., that our
mapping never flips the tangent space at any point. Unfortunately, all such
conditions are violated at some points for mappings far from the identity.
The proof of Poincaré’s theorem in the general case uses entirely different
arguments.

The  connection  between  fixed  points  of  mappings  and  critical  points  of 
generating  functions  seems  to  be  a  deeper  fact  than  the  theorem  on  mappings 
of  a  two-dimensional  annulus  into  itself.  Below,  we  give  several  examples  in 
which  this  connection  leads  to  meaningful  conclusions  which  are  true  under 
some  restrictions  whose  necessity  is  not  obvious. 

111 $d\Phi = \frac{1}{2}\begin{vmatrix} X - x & Y - y \\ dX + dx & dY + dy \end{vmatrix}$

C Symplectic diffeomorphisms of the torus

Consider  a  symplectic  diffeomorphism  of  the  torus  which  fixes  the  center  of 
gravity 

(x, y) → (x + f(x, y), y + g(x, y)) = (X, Y),

where x and y mod 2π are angular coordinates on the torus, “symplectic”
means the Jacobian D(X, Y)/D(x, y) is equal to 1, and the condition on
preserving the center of gravity means that the average values of the functions
f and g are equal to zero.

Theorem. Such a diffeomorphism has at least four fixed points, counting
multiplicity, and at least three geometrically different ones, at least under the
assumption that the eigenvalues of the Jacobi matrix are not equal to −1 at
any point.

The  proof  is  based  on  consideration  of  the  function  on  the  torus  given  by 
the  formula 

$$\Phi(x, y) = \frac{1}{2}\int (X - x)(dY + dy) - (Y - y)(dX + dx),$$

and  on  the  fact  that  a  smooth  function  on  the  torus  has  at  least  four  critical 
points  (counting  multiplicity)  of  which  at  least  three  are  geometrically 
different. 
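
A small numerical check may make the statement concrete. The map below is a hypothetical example of my own choosing, a composition of two shears Y = y + ε sin x, X = x + ε sin Y; each shear has Jacobian 1, the mean displacement vanishes, and the trace of the Jacobi matrix is 2 + ε² cos x cos Y, so no eigenvalue equals −1. The sketch verifies these properties and exhibits four fixed points.

```python
import math

EPS = 0.3   # illustrative size of the perturbation

def torus_map(x, y):
    """Composition of two shears of the torus (angles taken mod 2*pi)."""
    Y = y + EPS * math.sin(x)
    X = x + EPS * math.sin(Y)
    return X, Y

def jacobian_det(x, y, h=1e-6):
    """Determinant of the Jacobi matrix by central differences."""
    Xp, Yp = torus_map(x + h, y)
    Xm, Ym = torus_map(x - h, y)
    Xq, Yq = torus_map(x, y + h)
    Xr, Yr = torus_map(x, y - h)
    dXdx, dYdx = (Xp - Xm) / (2 * h), (Yp - Ym) / (2 * h)
    dXdy, dYdy = (Xq - Xr) / (2 * h), (Yq - Yr) / (2 * h)
    return dXdx * dYdy - dXdy * dYdx

def mean_displacement(n=64):
    """Grid average of (X - x, Y - y): the shift of the center of gravity."""
    sx = sy = 0.0
    for i in range(n):
        for j in range(n):
            x, y = 2 * math.pi * i / n, 2 * math.pi * j / n
            X, Y = torus_map(x, y)
            sx += X - x
            sy += Y - y
    return sx / n**2, sy / n**2

# Fixed points require sin(x) = 0 and sin(y) = 0.
fixed_points = [(0.0, 0.0), (0.0, math.pi), (math.pi, 0.0), (math.pi, math.pi)]
for p in fixed_points:
    print(p, torus_map(*p))
print(jacobian_det(1.0, 2.0), mean_displacement())
```

Here the four fixed points are all nondegenerate, so the bound of four is attained without any multiplicity.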

Attempts  at  proving  this  theorem  without  restrictions  on  the  eigenvalues 
meet with difficulties very similar to those encountered by Poincaré in the
theorem  about  the  annulus. 

We  note  that  the  theorem  about  the  annulus  would  follow  from  the  theorem  about  the  torus 
if  in  the  latter  we  could  throw  out  the  condition  on  the  eigenvalues.  In  fact,  we  can  put  together 
a  torus  from  two  copies  of  our  annulus,  inserting  a  narrow  connecting  annulus  along  each  of 
the  two  boundary  circles. 

Then  we  can  extend  our  mapping  of  the  annulus  to  a  symplectic  diffeomorphism  of  the 
torus  such  that:  (1)  on  each  of  the  two  large  annuli  the  diffeomorphism  coincides  with  the 
original,  (2)  on  each  of  the  connecting  annuli  the  diffeomorphism  has  no  fixed  points,  and  (3) 
the  center  of  gravity  remains  fixed. 

The  construction  of  such  a  diffeomorphism  of  the  torus  uses  the  property  that  the  boundary 
circles  rotate  in  different  directions.  On  each  connecting  annulus  all  points  are  translated  in  the 
same  direction  as  on  both  circles  bounding  the  connecting  annulus.  Since  the  translations  on 
the  connecting  annuli  are  in  opposite  directions,  the  size  of  the  translations  can  be  chosen  to 
ensure  preservation  of  the  center  of  gravity. 

Now  out  of  four  fixed  points  on  the  torus,  two  must  lie  in  the  original  annulus,  and  we  obtain 
the  theorem  on  annuli  from  the  theorem  on  tori. 

The  theorem  on  tori  formulated  above  can  be  generalized  to  other 
symplectic  manifolds,  both  two-dimensional  and  many-dimensional.  To 
formulate  these  generalizations,  we  must  first  reformulate  the  condition  of 
preservation  of  the  center  of  gravity. 

Let g : M → M be a symplectic diffeomorphism. We say that g is homologous
to the identity if it can be connected to the identity diffeomorphism
by a smooth curve g_t consisting of symplectic diffeomorphisms such that
the field of velocities ġ_t at each moment of time t has a single-valued
hamiltonian function. It can be shown that the symplectic diffeomorphisms
homologous to the identity form the commutator subgroup of the connected
component of the identity in the group of all symplectic diffeomorphisms of
the manifold.

In the case when our manifold is the two-dimensional torus, the
symplectic diffeomorphisms homologous to the identity are exactly those which
preserve the center of gravity.

Thus we come to the following generalization of Poincaré’s theorem.

Theorem.  Every  symplectic  diffeomorphism  of  a  compact  symplectic  manifold, 
homologous  to  the  identity,  has  at  least  as  many  fixed  points  as  a  smooth 
function  on  this  manifold  has  critical  points  (at  least  if  this  diffeomorphism 
is not too far from the identity).¹¹²

We  note  that  the  condition  of  the  mapping  being  homologous  to  the 
identity  is  essential,  as  we  see  already  from  the  example  of  a  translation  on 
the  torus,  which  has  no  fixed  points  at  all. 

As to the last restriction (that the diffeomorphism be not too far from the
identity), it is not clear whether it is essential.¹¹²ᵃ In the case that our manifold
is the two-dimensional torus, it is sufficient that none of the eigenvalues of the
Jacobi matrix of the diffeomorphism (in any global symplectic coordinate
system on ℝ^{2n}) be equal to minus one.

A restriction of this sort may be necessary in higher-dimensional problems. It is not
impossible that Poincaré’s theorem is due to an essentially two-dimensional effect, as is the
following theorem of A. I. Shnirel’man and N. A. Nikishin: Every area-preserving diffeomorphism
of the two-dimensional sphere to itself has at least two geometrically different fixed points.

The  proof  of  this  theorem  is  based  on  the  fact  that  the  index  of  the  gradient  vector  field 
of  a  smooth  function  of  two  variables  at  an  isolated  critical  point  cannot  be  greater  than  1 
(although it can be equal to 1, 0, −1, −2, −3, ...), and the sum of the indices of all the fixed
points  of  an  orientation-preserving  diffeomorphism  of  the  two-dimensional  sphere  to  itself 
is  equal  to  2.  On  the  other  hand,  the  index  of  the  gradient  of  a  smooth  function  of  a  large  number 
of  variables  at  a  critical  point  can  take  any  integer  value. 


112 [For a proof, see V. Arnold, Sur les propriétés topologiques des applications globalement
canoniques de la mécanique classique, C. R. Acad. Sci. Paris, 1965 and A. Weinstein, Symplectic
manifolds and their lagrangian submanifolds, Advances in Math. 6 (1971) 329-346.]

112a [Recently, Conley and Zehnder, followed by others, have proved the theorem for tori,
surfaces, and other manifolds, without the restriction of closeness to the identity.]


D Intersections of lagrangian manifolds

Poincaré’s argument can be given a slightly different form if on every
radius of the annulus we consider the points shifted only radially. There are
such points on every radius, since the boundary circles of the annulus turn
in different directions. Assume that we can make a smooth curve of radially
shifting points, separating the interior and exterior circles of the annulus.
Then the image of this curve under our mapping must intersect the curve
(since the regions into which the curve divides the annulus are carried to
regions of equal area).

If  this  curve  and  its  image  each  intersect  each  radius  once,  then  the  points 
of  intersection  of  the  curve  with  its  image  are  obviously  fixed  points  of  the 
mapping. 

Part  of  this  argument  can  be  carried  out  in  higher  dimensions,  and  this 
gives  useful  results  about  periodic  solutions  of  problems  in  dynamics.  The 
role  of  the  annulus  in  the  many-dimensional  case  is  played  by  the  phase 
space:  the  direct  product  of  a  region  in  euclidean  space  with  a  torus  of  the 
same  dimension  (the  annulus  is  the  product  of  an  interval  with  the  circle). 
A symplectic structure on the phase space is defined in the usual way, i.e., it has
the form Ω = Σ dx_k ∧ dy_k, where the x_k are action variables and the y_k are angle
variables.

It is not difficult to explain which symplectic diffeomorphisms of our
phase space are homologous to the identity. Namely, a symplectic
diffeomorphism A is homologous to the identity if it can be obtained from the
identity by a continuous deformation and if

$$\oint_{\gamma} x\,dy = \oint_{A\gamma} x\,dy$$

for any closed contour γ (not necessarily homologous to zero). The condition
that the transformation be homologous to the identity prohibits systematic
shifts along the x-direction (“evolution of the action variables”), but permits
shifts along the tori.

We consider one of the n-dimensional tori x = c = const and apply to
it our symplectic diffeomorphism homologous to the identity. It turns out
that the original torus intersects its image in at least 2^n points (counting
multiplicities), of which at least n + 1 are geometrically different, at least
under the assumption that the image torus has an equation of the form
x = f(y), where f is smooth.

For n = 1, this assertion means that each of the concentric circles
constituting the annulus intersects its image in at least two points. This also
follows from the preservation of area, so that the assumption that the image
has equation x = f(y) is not necessary.

Whether  or  not  this  assumption  is  necessary  in  higher  dimensions  is  not 
known.  If  we  make  this  assumption,  the  proof  proceeds  in  the  following  way. 

We note that the original torus is a lagrangian submanifold of phase
space. Our diffeomorphism is symplectic, so the image torus is also
lagrangian. Therefore, the 1-form (x − c) dy on it is closed. Furthermore, this form
on the torus is the total differential of some single-valued smooth function F,
since our diffeomorphism is homologous to the identity, and therefore for
any closed contour γ we have

$$\oint_{A\gamma} (x - c)\,dy = \oint_{A\gamma} x\,dy - \oint_{A\gamma} c\,dy = \oint_{\gamma} x\,dy - \oint_{A\gamma} c\,dy = c\oint_{\gamma} dy - c\oint_{A\gamma} dy = 0.$$


We  note  that  points  of  intersection  of  the  torus  with  its  image  are  critical 
points  of  the  function  F  (since  at  them  dF  =  (x  —  c)dy  =  0). 

From the condition of single-valued projection of the image torus (i.e.,
from the fact that the image torus has equation x = f(y)) it follows that,
conversely,  all  critical  points  of  the  function  F  are  points  of  intersection  of 
our  tori.  In  fact,  under  these  conditions  y  can  be  taken  for  local  coordinates 
on  the  torus,  and  therefore  the  fact  that  dF  is  zero  for  all  vectors  tangent  to 
the  image  torus  implies  x  =  c. 

A smooth function on an n-dimensional torus has at least 2^n critical points,
counting  multiplicities,  of  which  at  least  n  +  1  are  geometrically  different 
(cf.,  for  example,  Milnor,  Morse  Theory,  Princeton  University  Press,  1967). 
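
The count 2^n is attained by simple examples. As an illustration (the function F below is my choice, not taken from the text), the critical points of F(y) = cos y_1 + ... + cos y_n on the n-dimensional torus can be enumerated directly: dF = 0 forces every y_k to be 0 or π, and the number of critical points of Morse index k is the binomial coefficient C(n, k).

```python
import itertools
import math

def critical_points(n):
    """Critical points of F(y) = cos(y_1) + ... + cos(y_n) on the n-torus."""
    # dF = 0 forces sin(y_k) = 0 for every k, i.e. each y_k is 0 or pi.
    return list(itertools.product((0.0, math.pi), repeat=n))

def morse_index(point):
    # The Hessian is diagonal with entries -cos(y_k): negative where y_k = 0.
    return sum(1 for yk in point if yk == 0.0)

def count_by_index(n):
    pts = critical_points(n)
    return [sum(1 for p in pts if morse_index(p) == k) for k in range(n + 1)]

for n in range(1, 5):
    print(n, len(critical_points(n)), count_by_index(n))
```

All these critical points are nondegenerate, so each counts with multiplicity one; the bound of n + 1 geometrically different points concerns functions whose critical points are allowed to be degenerate.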

Therefore, our tori intersect in at least 2^n points (counting multiplicities),
and  there  are  at  least  n  +  1  geometrically  different  points  of  intersection. 

Exactly the same argument shows that any lagrangian torus intersects
its image in at least 2^n points (of which at least n + 1 are geometrically
different), under the assumption that both the original torus and its image
project single-valued onto the y-space, i.e., are given by equations x = f(y)
and x = g(y), respectively. Besides, this statement reduces to the previous
one by the canonical transformation (x, y) → (x − f(y), y).


E  Applications  to  determining  fixed  points  and  periodic  solutions 

We  now  consider  a  symplectic  transformation,  homologous  to  the  identity, 
of  the  special  form  which  arises  in  integrable  problems  in  dynamics,  i.e.,  of 
the  form 

$$A_0(x, y) = (x, y + \omega(x)), \quad \text{where } \omega = \frac{\partial S}{\partial x}.$$

Here x ∈ ℝ^n is the action variable and y mod 2π ∈ T^n is the angular coordinate.

We assume that on the torus x = x_0 all the frequencies are commensurable:

$$\omega_i(x_0) = \frac{k_i}{N}\,2\pi \text{ with integers } k_i, N; \qquad \omega(x_0) \neq 0,$$

and that the nondegeneracy condition

$$\det\left.\frac{\partial \omega}{\partial x}\right|_{x_0} \neq 0$$

is satisfied.

Theorem. Every symplectic diffeomorphism A homologous to the identity and
sufficiently close to A_0 has, near the torus x = x_0, at least 2^n periodic
points ξ of period N (such that A^N ξ = ξ), counting multiplicity.


The proof could be reduced to investigating the intersection of two lagrangian submanifolds
of a 4n-dimensional space ($\mathbb{R}^n \times T^n \times \mathbb{R}^n \times T^n$) with
$\Omega = dx \wedge dy - dX \wedge dY$, one of which is the diagonal (X = x, Y = y) and the
other the graph of the mapping $A^N$.

However, it is easier to directly construct a suitable function on the torus. In fact, the
mapping $A_0^N$ has the form

$$(x, y) \to (x,\ y + \alpha(x)), \quad \text{where } \alpha(x_0) = 0, \quad \det \left. \frac{\partial \alpha}{\partial x} \right|_{x_0} \neq 0.$$

By the implicit function theorem, the mapping $A^N$ has, near the torus $x = x_0$, a torus which
is displaced only radially ($(x, y) \to (X, y)$) and is given by an equation of the form x = f(y);
its image is also given by an equation x = g(y) of the same form. In this notation,
X(f(y), y) = g(y), Y(f(y), y) = y.

Since A is homologous to the identity, it follows that $A^N$ has a single-valued global generating
function of the form Xy + S(X, y), where S has period $2\pi$ in the variable y.

The function F(y) = S(X(f(y), y), y) has at least $2^n$ critical points $y_k$ on the torus. All the
points $\xi_k = (f(y_k), y_k)$ are fixed points for $A^N$. In fact,

$$dF = (x - X)\,dy + (Y - y)\,dX = (x - X)\,dy = (f(y) - g(y))\,dy.$$

Therefore, since $dF|_{y_k} = 0$, it follows that $f(y_k) = g(y_k)$, i.e., $A^N \xi_k = \xi_k$, as was
to be shown.
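The construction in this proof can be followed by hand in the simplest case n = 1. The sketch below is a hypothetical model, not taken from the text: the map `apply_map` plays the role of $A^N$ and is defined by the generating function Xy + S(X, y) with the assumed choice S(X, y) = X²/2 + ε cos y, so that α(x) = x and $x_0 = 0$. The names `EPS`, `S`, `apply_map`, `f`, `g` and `F` are illustrative.

```python
import math

EPS = 0.1  # size of the perturbation (illustrative value)


def S(X, y):
    """Periodic-in-y part of the generating function X*y + S(X, y)."""
    return 0.5 * X * X + EPS * math.cos(y)


def apply_map(x, y):
    """The map (x, y) -> (X, Y) defined by x = X + dS/dy, Y = y + dS/dX."""
    X = x + EPS * math.sin(y)   # solves x = X - EPS*sin(y) for X
    Y = y + X                   # dS/dX = X
    return X, Y


def f(y):
    """Torus displaced only radially: Y = y forces X = 0, so x = -EPS*sin(y)."""
    return -EPS * math.sin(y)


def g(y):
    """Image of that torus under the map: x = X = 0."""
    return 0.0


def F(y):
    """F(y) = S(X(f(y), y), y), a function on the circle."""
    X, _ = apply_map(f(y), y)
    return S(X, y)
```

In this model F(y) = ε cos y, whose two critical points y = 0 and y = π give the two fixed points (f(y), y) of the map, in agreement with the bound $2^n$ for n = 1.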


We turn now to closed orbits of conservative systems. Using the terminology
of Appendix 8, we can formulate the result as follows.

Corollary. Upon disintegration of an n-dimensional torus, entirely filled up by
closed trajectories of an isoenergetically nondegenerate system, at least
$2^{n-1}$ closed trajectories of the perturbed problem are formed (counting
multiplicities), among which at least n are geometrically distinct, provided
the perturbation is sufficiently small.

The proof is reduced to the preceding theorem with the help of a (2n - 2)-
dimensional surface of section. We must first choose angular coordinates y
such that the closed trajectories of the unperturbed problem on the torus
are given by the equations $y_2 = \cdots = y_n = 0$, and then define a surface of
section by $y_1 = 0$.

In the case of two degrees of freedom we can apply Poincaré's theorem to
the annuli formed by intersecting invariant tori with a two-dimensional
intersecting surface. We obtain the following result:

In the gap between two two-dimensional invariant tori of a system with
two degrees of freedom there are always at least two closed phase trajectories,
if the ratios of the frequencies of conditionally-periodic motions on these tori
are different.

In this way we obtain many periodic solutions in all problems with two
degrees of freedom, where invariant tori are found (for example, in the bounded
circular three-body problem, in the problem of closed geodesics, etc.).
There is even a conjecture that in hamiltonian systems of “general form” with
compact phase spaces, the closed phase curves form a dense set.113 However,
if this is true, the closedness of most of these curves has little importance
since their periods are extremely large.

As an example of applying Poincaré's methods to systems with more than
two degrees of freedom, we have a theorem of Birkhoff about the existence of
infinitely many periodic solutions close to a given linearly stable periodic
solution of general form (or about the existence of infinitely many periodic
points in a neighborhood of a fixed point of a linearly stable nondegenerate
symplectic mapping of a space to itself). In the proof, the mapping is first
approximated by its normal form, and then the connection between fixed
points of a mapping and critical points of the generating function is used.

Knowing periodic solutions allows us, among other things, to prove the
nonexistence of first integrals (other than the classical ones) in many problems
in dynamics. Assume, for example, that on some level manifold of known
integrals we discover a periodic trajectory which is unstable. Its separatrices,
in general, form a complicated network, which we considered in Appendix 7.
If this phenomenon of splitting of separatrices is discovered, and if we can
show that the separatrices are not contained in any manifold of lower dimension
than the level manifold we are considering, then we can be sure that the
system has no new first integrals.

The  complicated  behavior  of  phase  curves,  which  obstructs  the  existence 
of  first  integrals,  can  often  be  detected  without  the  help  of  periodic  solutions 
by  one  simple  glance  at  the  picture,  obtained  by  a  computer,  formed  by  the 
intersection  of  the  phase  curves  with  the  surface  of  section. 

F  Invariance  of  generating  functions 

We  have  already  noted  the  discouraging  noninvariance  of  generating 
functions  with  respect  to  the  choice  of  a  canonical  coordinate  system  on  a 
symplectic  manifold.  On  the  other  hand,  we  repeatedly  used  the  connection 
between  fixed  points  of  a  mapping  and  critical  points  of  the  generating 
function. 

It turns out that, although generally the generating function is not
invariantly associated to the mapping, near a fixed point there is an invariant
connection. More precisely, suppose we are given a symplectic diffeomorphism
fixing some point. In a neighborhood of this point, we define a
“generating function”

$$\Phi = \frac{1}{2} \int \sum_k \begin{vmatrix} X_k - x_k & Y_k - y_k \\ dX_k + dx_k & dY_k + dy_k \end{vmatrix}$$

with the help of some symplectic coordinate system (x, y).114 Using another
symplectic coordinate system (x', y'), we construct a generating function $\Phi'$
in the same way.

113 A proof of this density in the $C^1$ topology has been announced by C. Pugh and C. Robinson.
[Editor's note]

Theorem. If the linearization of the symplectic diffeomorphism at the fixed
point has no eigenvalues equal to -1, then the functions $\Phi$ and $\Phi'$ are
equivalent in a neighborhood of the fixed point, in the sense that there is a
diffeomorphism g (in general not symplectic) such that

$$\Phi(z) = \Phi'(g(z)) + \text{const}.$$

For the proof see the article: A. Weinstein, The invariance of Poincaré's
generating function for canonical transformations, Inventiones Mathematicae,
16, No. 3 (1972), 202-214.
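It may help to check directly why an integral of this kind defines a function near the fixed point. The following computation is a sketch under the assumption that the integrand of the “generating function” is the determinant form $\frac12 \sum_k [(X_k - x_k)(dY_k + dy_k) - (Y_k - y_k)(dX_k + dx_k)]$, where (X, Y) denotes the image of (x, y); it shows that this 1-form is closed precisely because the mapping is symplectic.

```latex
\begin{aligned}
d\Bigl[\tfrac12\bigl((X_k - x_k)(dY_k + dy_k) - (Y_k - y_k)(dX_k + dx_k)\bigr)\Bigr]
  &= \tfrac12\bigl[(dX_k - dx_k)\wedge(dY_k + dy_k) - (dY_k - dy_k)\wedge(dX_k + dx_k)\bigr] \\
  &= dX_k \wedge dY_k - dx_k \wedge dy_k ,
\end{aligned}
\qquad
\sum_k \bigl(dX_k \wedge dY_k - dx_k \wedge dy_k\bigr) = 0 .
```

The mixed terms $dX_k \wedge dy_k$ and $dx_k \wedge dY_k$ cancel in pairs, and the remaining sum vanishes because the diffeomorphism preserves the form $\sum_k dx_k \wedge dy_k$. A closed form on a simply connected neighborhood has a single-valued integral, determined up to a constant.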

It should be noted that two diffeomorphisms with generating functions
which are equivalent in a neighborhood of a fixed point are not necessarily
equivalent in the class of symplectic diffeomorphisms (for example, rotation
and rotation through an angle which depends on the radius, with nondegenerate
quadratic parts of the generating function at zero).

Since the first edition of this book appeared in 1974, the content of
this Appendix has grown into a new branch of mathematics: symplectic
topology. To describe this development (triggered by the conjectures in this
Appendix, which still remain, for general manifolds, neither proved nor
disproved) one would need a book longer than the present one.

The  interested  reader  might  follow  this  development  using  the  (incomplete) 
bibliography  on  pages  503-509. 


114 The increase of this function along any arc is equal to the integral of the form defining the
symplectic structure over the band formed by the rectilinear intervals connecting each point
with its image. Therefore, the function $\Phi$ is associated to the mapping invariantly with respect
to linear canonical changes of coordinates.

Appendix 10: Multiplicities of characteristic frequencies,
and ellipsoids depending on parameters

Several times in this course we have encountered families of ellipsoids in
euclidean space. For example, in studying the dependence on parameters of
characteristic frequencies of small oscillations, we encountered equipotential
surfaces which were ellipsoids in euclidean space, depending upon the degree
of rigidity of the system (the metric of the space was defined by the kinetic
energy). Another example was the ellipsoid of inertia of a rigid body (the
parameter here was the shape of the rigid body and its distribution of mass).

Here we will consider the general problem of describing the values of the
parameter for which the spectrum of eigenvalues degenerates, i.e., the
corresponding ellipsoid becomes an ellipsoid of revolution. We note that the
eigenvalues of a quadratic form on euclidean space (or the lengths of the axes
of an ellipsoid) change continuously under continuous changes of the
parameters of a system (the coefficients of the form). It seems natural to
expect that in a system depending on one parameter, under changes of the
parameter, at certain moments one of the eigenvalues would collide with
another, so that for these values of the parameter the system would have a
multiple spectrum.

Suppose, for example, that we want to make the ellipsoid of inertia of a
rigid body into an ellipsoid of revolution by movement of an adjustable mass
along an arc rigidly attached to the body so that there is one parameter at
our disposal. The three major axes a, b, and c will be continuous functions of
this parameter, and at first glance it seems that for a suitable value of the
parameter p we can achieve equality of two of the axes, say a(p) = b(p). It
turns out, however, that this is not so, and that generally we need to attach
at least two adjustable masses to make the ellipsoid of inertia an ellipsoid of
revolution.

In general, a multiple spectrum in typical families of quadratic forms is
observed only for two or more parameters, while in one-parameter families
of general form the spectrum is simple for all values of the parameter. Under
a change of parameter in the typical one-parameter family, the eigenvalues
can approach closely, but when they are sufficiently close, it is as if they
begin to repel one another. The eigenvalues again diverge, disappointing the
person who hoped, by changing the parameter, to achieve a multiple spectrum.

In this appendix we consider the reasons for this seemingly strange behavior
of the eigenvalues, and we discuss briefly analogous questions for
systems with various groups of symmetries.

A  The  manifold  of  ellipsoids  of  revolution 

Consider the set of all possible quadratic forms on the n-dimensional euclidean
space $\mathbb{R}^n$. This set has itself a natural structure of a vector space of
dimension n(n + 1)/2. For example, the quadratic forms on the plane form a
three-dimensional space (a form $Ax^2 + 2Bxy + Cy^2$ has as coordinates the
three numbers A, B, and C).

The positive-definite forms form an open region in this space of all
quadratic forms (for example, in the case of the plane this is the inside of one
nappe of the cone $B^2 = AC$ of degenerate forms).

Every ellipsoid centered at the origin defines a positive-definite quadratic
form, for which it is the level set of 1; conversely, the set of level 1 of any
positive-definite quadratic form is an ellipsoid. We can therefore identify the
sets of positive-definite quadratic forms and ellipsoids centered at the origin.
In this way we give the set of ellipsoids with center 0 in $\mathbb{R}^n$ the structure of a
smooth manifold of dimension n(n + 1)/2 (this manifold is covered by one
chart: a region in the space of quadratic forms).

Now consider the set of all ellipsoids of revolution. We claim that this set
has codimension 2 in the space under consideration, i.e., it is given by two
independent equations, rather than one as it would seem at first glance. More
precisely, we have

Theorem 1. The set of ellipsoids of revolution is a finite union of smooth
submanifolds of codimension 2 and higher in the manifold of all ellipsoids.


The codimension of a submanifold is the difference between the dimension
of the ambient space and the dimension of the submanifold.

Proof. We first consider an ellipsoid in n-dimensional space which has two
equal axes, and whose other axes are distinct. Such an ellipsoid is defined by
the directions of the distinct axes, which gives

$$(n - 1) + (n - 2) + \cdots + 2 = \frac{(n + 1)(n - 2)}{2}$$

different parameters, and also by the magnitudes of the axes, which gives
n - 1 parameters. Thus the total number of parameters is

$$\frac{(n + 1)(n - 2)}{2} + (n - 1) = \frac{n^2 - n - 2 + 2n - 2}{2} = \frac{n(n + 1)}{2} - 2,$$

which is two less than the dimension of the space of all ellipsoids (which is
n(n + 1)/2). This count of parameters also shows that the set of ellipsoids
with exactly two equal axes is a manifold.

As for ellipsoids with a larger number of equal axes, it is clear that they
form a set of even smaller dimension. A rigorous proof follows from the
following lemma.


Lemma. The set of all ellipsoids with $\nu_2$ double, $\nu_3$ triple, $\nu_4$ four-fold axes, etc.
is a smooth submanifold of the manifold of all ellipsoids, with codimension

$$2\nu_2 + 5\nu_3 + 9\nu_4 + \cdots = \sum_i \tfrac{1}{2}(i - 1)(i + 2)\,\nu_i.$$

The proof of this lemma reduces to the same kind of parameter count as
in the special case analyzed above (which corresponds to $\nu_2 = 1$,
$\nu_3 = \nu_4 = \cdots = 0$). The reader can easily carry out this calculation, noting first
that the dimension of the manifold of all k-dimensional subspaces in an
n-dimensional vector space is equal to k(n - k) (since a k-dimensional plane in
general position in an n-dimensional space can be thought of as the graph of
a mapping from a k-dimensional space to an (n - k)-dimensional space, and
such a mapping is given by a rectangular k × (n - k) matrix).
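The parameter count suggested here can be sketched in a few lines. The sketch assumes one particular way of organizing the count (choosing the eigenspaces one after another, each as a subspace of what remains, and adding one length per distinct axis); the function names are illustrative and do not come from the text.

```python
def stratum_dimension(multiplicities):
    """Number of parameters of ellipsoids whose axes have the given
    multiplicities (one entry per distinct axis length)."""
    remaining = sum(multiplicities)   # dimension of the space not yet used up
    dimension = len(multiplicities)   # one length per distinct axis
    for m in multiplicities:
        # choose an m-dimensional subspace of the remaining space: m(remaining - m)
        dimension += m * (remaining - m)
        remaining -= m
    return dimension


def codimension_by_count(multiplicities):
    """Codimension in the n(n + 1)/2-dimensional manifold of all ellipsoids."""
    n = sum(multiplicities)
    return n * (n + 1) // 2 - stratum_dimension(multiplicities)


def codimension_by_lemma(multiplicities):
    """The formula of the lemma: sum of (i - 1)(i + 2)/2 over all axes."""
    return sum((m - 1) * (m + 2) // 2 for m in multiplicities)
```

For example, `codimension_by_count([2, 1, 1])` counts ellipsoids in four-dimensional space with one double axis, and `codimension_by_count([3, 1])` those with one triple axis; both counts can be compared with `codimension_by_lemma`.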

Example. Consider the case n = 2, i.e., ellipses in the plane. An ellipse is
determined by three parameters (e.g., the lengths of the two axes and the
angle giving the direction of one of them). Thus the manifold of ellipses in the
plane is three-dimensional, as it must be by our formula.

A  circle,  however,  is  determined  by  one  parameter  (the  radius).  Thus  the 
manifold  of  circles  in  the  space  of  ellipses  is  a  line  in  a  three-dimensional 
space,  and  not  a  surface  as  it  would  seem  at  first  glance. 


This "paradox" becomes, perhaps, clearer from the following calculation. The quadratic
forms $Ax^2 + 2Bxy + Cy^2$ with equal eigenvalues form a submanifold of the three-dimensional
space with coordinates A, B, and C, given by one equation $\lambda_1 - \lambda_2 = 0$, where
$\lambda_{1,2}(A, B, C)$ are the eigenvalues. However, the square of the left-hand side of this
equation is a sum of two squares, as is clear from the formula for the discriminant of the
characteristic equation:

$$\Delta = (A + C)^2 - 4(AC - B^2) = (A - C)^2 + 4B^2.$$

Thus the single equation $\Delta = 0$ determines a line in the three-dimensional space of quadratic
forms (A = C, B = 0), and not a surface.
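The discriminant identity is easy to check numerically. The following minimal sketch treats the form as the symmetric matrix with rows (A, B) and (B, C); the function names are illustrative.

```python
import math


def discriminant(A, B, C):
    """Discriminant of the characteristic equation of A x^2 + 2 B x y + C y^2."""
    return (A + C) ** 2 - 4 * (A * C - B ** 2)


def sum_of_squares(A, B, C):
    """The same quantity written as (A - C)^2 + 4 B^2."""
    return (A - C) ** 2 + 4 * B ** 2


def eigenvalues(A, B, C):
    """Eigenvalues of the symmetric matrix [[A, B], [B, C]], smaller first."""
    mean = (A + C) / 2
    half_gap = math.sqrt(sum_of_squares(A, B, C)) / 2
    return mean - half_gap, mean + half_gap
```

The two eigenvalues returned by `eigenvalues` coincide only when both squares vanish, that is, when A = C and B = 0: two conditions rather than one.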


A simple consequence of the fact that the manifold of ellipsoids of revolution
has codimension 2 is that this manifold does not divide the space of all
ellipsoids (and the manifold of quadratic forms with a multiple spectrum does
not divide the space of quadratic forms), just as a line does not divide a
three-dimensional space. Therefore, we can assert not only that in an ellipsoid in
“general position” all the axes have different lengths, but also that any two
such ellipsoids can be connected by a smooth curve in the space of ellipsoids
consisting entirely of ellipsoids with axes of different lengths. Furthermore, if two
ellipsoids in general position are connected by a smooth curve in the space
of ellipsoids which contains a point which is an ellipsoid of revolution, then
by an arbitrarily small displacement of the curve we can remove it from the
set of ellipsoids of revolution, so that on the new curve all the points will be
ellipsoids without multiple axes.

One consequence of what we have said is a simple proof of the theorem
that characteristic frequencies increase when the rigidity of a system is
increased. The derivative of a non-multiple eigenvalue of a quadratic form
with respect to a parameter is determined by the derivative of the quadratic
form in the corresponding characteristic direction. If the rigidity is increased,
the potential energy increases in every direction, including the characteristic
directions. Thus the characteristic frequencies also increase. Hence we
have proved the theorem on the growth of frequencies in the case when it
is possible to go from the original system to a more rigid system, avoiding
multiple spectra. The proof in the presence of a multiple spectrum is now
obtained by a passage to the limit, based on the fact that the interior of the
path from the original system to the more rigid system can be removed by
an arbitrarily small perturbation from the set of systems with multiple
spectra.
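The statement about the derivative of a non-multiple eigenvalue can be verified in one line. The notation below is an assumption made for this sketch and is not used in the text: Q(p) is the symmetric operator of the quadratic form with respect to the euclidean structure, $\lambda(p)$ a simple eigenvalue, e(p) a unit eigenvector, all assumed differentiable in the parameter p, and a prime denotes d/dp.

```latex
Q e = \lambda e, \quad (e, e) = 1
\;\Longrightarrow\;
Q' e + Q e' = \lambda' e + \lambda e'
\;\Longrightarrow\;
\lambda' = (Q' e, e) + (Q e', e) - \lambda (e', e) = (Q' e, e),
```

since $(Q e', e) = (e', Q e) = \lambda (e', e)$ by the symmetry of Q. Thus if the form increases in every direction, so that $(Q' e, e) \geq 0$, a simple eigenvalue cannot decrease.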

In summary, we can say that a typical one-parameter family of ellipsoids
(or quadratic forms in euclidean space) does not contain ellipsoids of revolution
(quadratic forms with multiple spectra). Applying this to an ellipsoid
of inertia we obtain the conclusion above about the necessity for two adjustable
masses.

We turn now to two-parameter systems. It follows from our calculations
that, in a typical two-parameter system, ellipsoids of revolution are
encountered only at isolated points of the parameter plane.


Consider,  for  example,  a  convex  surface  in  three-dimensional  euclidean  space.  The  second 
fundamental  form  of  the  surface  determines  an  ellipse  in  the  tangent  space  at  every  point. 
Therefore,  we  have  a  two-parameter  family  of  ellipses  (which  can  be  translated  to  one  plane 
by  choosing  a  local  coordinate  system  near  a  point  on  the  surface).  We  come  to  the  conclusion 
that,  at  every  point  of  the  surface  except  at  certain  isolated  points,  the  ellipse  has  axes  of  different 
lengths.  Therefore,  on  surfaces  of  general  form,  there  are  two  orthogonal  fields  of  directions  (the 
major  and  minor  axes  of  the  ellipses)  with  isolated  singular  points.  In  differential  geometry 
these  directions  are  called  the  directions  of  principal  curvature,  and  these  singular  points  are 
called  umbilical  points.  For  example,  on  the  surface  of  an  ellipsoid  there  are  four  umbilical 
points:  they  lie  on  the  ellipse  containing  the  major  and  minor  axes,  and  two  of  them  are  clearly 
visible  in  the  picture  of  the  geodesics  on  an  ellipsoid  (cf.  Figure  207). 

In  exactly  the  same  way,  in  a  typical  three-parameter  family,  ellipsoids  of 
revolution  are  encountered  only  on  certain  lines  in  the  three-dimensional 
parameter space. For example, if at every point of three-dimensional euclidean
space, we are given an ellipsoid (i.e., a symmetric two-index tensor),
then  the  singularities  of  the  fields  of  principal  axes  will  be,  in  general,  on 
certain  lines  (where  two  of  the  three  fields  of  directions  have  discontinuities). 
These  lines,  like  the  umbilical  points  in  the  preceding  example,  are  of  several 
different  types.  Their  classification  (for  typical  fields  of  ellipsoids)  can  be 
obtained  from  the  classification  of  singularities  of  lagrangian  projections 
given  in  Appendix  12. 

In a typical four-parameter family, ellipsoids of revolution occur on
two-dimensional surfaces in the space of parameters. These surfaces have no
singularities other than transverse intersections at isolated points of the
parameter space; these values of the parameters correspond to ellipsoids
with two (different) pairs of equal axes.

Triple axes appear first for five parameters, at isolated points of the parameter
space. The values of the parameters corresponding to ellipsoids with a
double axis form a three-dimensional manifold in the five-dimensional
parameter space with two types of singularities: transversal intersections of
two branches along some curve and conic singularities at isolated points (not
lying on this curve), i.e., at points of the parameter space corresponding to
ellipsoids with three equal axes. These conic singularities have the following
structure: by intersecting the three-dimensional manifold of ellipsoids of
revolution with a four-dimensional sphere of small radius with center at the
singular point, we obtain two copies of the projective plane. The resulting
embeddings of the projective plane in the four-dimensional sphere are
diffeomorphic to the embedding given by the five spherical harmonics of degree two
on the two-dimensional sphere (five linear combinations of the functions $x_i x_j$,
orthonormal in the space of functions on the sphere $x_1^2 + x_2^2 + x_3^2 = 1$,
orthogonal to the identity, give an even mapping of $S^2$ into $S^4$ and, therefore,
an embedding $\mathbb{R}P^2 \to S^4$).
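A mapping of this kind can be written out explicitly. The sketch below uses one particular choice of five quadratic harmonics, scaled here (an assumption of the sketch) so that the image of the unit sphere lies on the unit sphere in five-dimensional space; with this scaling the five functions are a common multiple of an orthonormal set. The function names are illustrative.

```python
import math


def harmonics(x1, x2, x3):
    """Five quadratic harmonics of a point (x1, x2, x3) of the unit sphere,
    scaled so that the squares of the five values add up to 1."""
    r3 = math.sqrt(3.0)
    return (
        r3 * x1 * x2,
        r3 * x1 * x3,
        r3 * x2 * x3,
        0.5 * r3 * (x1 * x1 - x2 * x2),
        0.5 * (2 * x3 * x3 - x1 * x1 - x2 * x2),
    )


def point_on_sphere(theta, phi):
    """Point of the unit sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
```

Because every component is quadratic, `harmonics` takes the same value at a point and at its antipode, so it descends to a mapping of the projective plane into the four-dimensional sphere.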

It  remains  to  describe  the  behavior  of  the  eigenvalues  of  a  quadratic  form 
in  a  typical  two-parameter  family  as  the  parameter  approaches  a  singular 
point  where  the  two  eigenvalues  coincide.  A  little  calculation  shows  that  the 
graph  of  the  pair  of  eigenvalues  we  are  considering  has,  over  the  plane  of 
parameters  near  the  singular  point,  the  form  of  a  two-sheeted  cone,  whose 
vertex  corresponds  to  the  singular  point,  and  each  of  its  nappes  to  one  of  the 
eigenvalues  (Figure  243). 

Figure 243 Characteristic frequencies of one- and two-parameter families of
oscillating systems of general form

A  typical  one-dimensional  subfamily  of  our  two-dimensional  family  has 
the  form  of  a  curve  in  the  plane  of  parameters  which  does  not  pass  through 
any  singular  points.  Every  one-parameter  family  which  contains  a  singular 
point  can  be  removed  from  it  by  a  small  perturbation;  the  resulting  one- 
parameter  family  will  be  a  curve  in  the  space  of  parameters  passing  near  the 
singular  point.  The  graph  of  the  eigenvalues  over  a  curve  on  the  plane  of 
parameters  passing  near  a  singular  point  consists  of  those  points  of  the  cone 
which  project  onto  this  curve.  Therefore,  this  graph  near  the  singular  point  is 
close  to  a  hyperbola,  resembling  a  pair  of  intersecting  straight  lines  (a  pair  of 
straight  lines  would  be  obtained  if  our  one-parameter  family  passed  through 
the  singular  point). 
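The cone and the hyperbola are easy to see in a model family. The sketch below assumes the hypothetical two-parameter family A = p, B = q, C = -p of forms $Ax^2 + 2Bxy + Cy^2$, whose singular point is p = q = 0; it is an illustration, not a family considered in the text.

```python
import math


def eigenvalues(A, B, C):
    """Eigenvalues of the form A x^2 + 2 B x y + C y^2, smaller first."""
    mean = (A + C) / 2
    half_gap = math.sqrt((A - C) ** 2 + 4 * B ** 2) / 2
    return mean - half_gap, mean + half_gap


def family(p, q):
    """Model two-parameter family; the eigenvalues are -r and +r with
    r = sqrt(p^2 + q^2), i.e. a two-sheeted cone over the (p, q) plane."""
    return eigenvalues(p, q, -p)


def min_gap_along_line(q, samples=201):
    """Smallest eigenvalue gap on the one-parameter subfamily
    q = const, p running from -1 to 1."""
    gaps = []
    for k in range(samples):
        p = -1 + 2 * k / (samples - 1)
        low, high = family(p, q)
        gaps.append(high - low)
    return min(gaps)
```

On the line q = 0, which passes through the singular point, the gap closes; on a nearby line q = const ≠ 0 the eigenvalues trace the two branches of a hyperbola and the gap never drops below 2|q|.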


This discussion of eigenvalues of two-parameter systems of quadratic
forms explains the strange behavior of characteristic frequencies when a
single parameter is varied: in general (except for completely singular cases),
when a single parameter is varied the characteristic frequencies can approach
one another but cannot collide; after approaching, they must again go off in
different directions.

B  Application  to  the  study  of  oscillations  of  continuous  media 

The  general  argument  above  has  numerous  applications  in  the  study  of  the 
dependence  on  parameters  of  the  characteristic  frequencies  of  various 
mechanical  systems  with  finitely  many  degrees  of  freedom;  however,  the  most 
interesting  applications  may  be  to  systems  with  infinitely  many  degrees  of 
freedom,  describing  oscillations  of  continuous  media.  These  applications  are 
based  on  the  fact  that  the  codimensions  of  manifolds  of  ellipsoids  with  given 
multiplicities  of  axes  are  determined  by  these  multiplicities  and  do  not  depend 
on  the  dimension  of  the  space. 

For example, the codimension of the set of ellipsoids of revolution in the
manifold of all ellipsoids is equal to two in a space of any dimension; therefore,
it is natural to assume that in the infinite “manifold” of ellipsoids in
infinite-dimensional Hilbert space, the set of ellipsoids of revolution has
codimension 2 (and, in particular, the space of ellipsoids without multiple
axes is connected).

Of  course,  arguments  of  this  kind  need  rigorous  justification.  We  will  not, 
however,  occupy  ourselves  with  this,  but  we  will  see  what  conclusions  follow 
from  the  argument  above  if  we  apply  it  to  the  problem  of  oscillations  in 
continuous  media. 

The kinetic energy of a continuous medium filling a compact region D is
expressed in terms of the deviation u of a point x from equilibrium by the
formula

$$T = \frac{1}{2} \int_D \dot{u}^2 \, dx.$$

For definiteness, we can take the medium to be a membrane (in this case the
region D is two-dimensional, and the deviation u one-dimensional). The
kinetic energy defines a euclidean structure on the configuration space of the
problem (i.e., in the space of functions u). The potential energy is given by the
Dirichlet integral

$$U = \frac{1}{2} \int_D (\nabla u)^2 \, dx$$

(from the mathematical point of view these data constitute the definition of
the membrane).

The squares of the characteristic frequencies of the membrane are the
eigenvalues of the quadratic form U on the configuration space, whose metric
is defined using the kinetic energy. We assume that a typical membrane
corresponds to a typical quadratic form (this assumption means transversality of
the manifold of quadratic forms corresponding to different membranes to
the manifold of forms with multiple eigenvalues). If we believe in this property
of general position, we come to the following conclusions.

1.  For  membranes  in  general  position,  all  the  characteristic  frequencies  are 
different.  We  can  go  from  one  membrane  in  general  position  to  another 
by  a  continuous  path  consisting  entirely  of  membranes  with  simple 
spectra.  Furthermore,  a  typical  path  connecting  any  two  membranes  does 
not  contain  even  one  membrane  with  a  multiple  spectrum  (except, 
possibly,  the  ends  of  the  path). 

2. By varying two parameters of the membrane we can make two characteristic
frequencies coincide; to obtain a triple frequency, we must have at
our disposal five independent parameters; for a four-fold frequency we
need nine parameters, etc.

3. If, by starting from a membrane with a simple spectrum and continuously
deforming it, we pass to another membrane with a simple spectrum along
any path in general position, then as a result, the k-th largest characteristic
frequency of the second membrane is always obtained independently of
the path of deformation from the k-th largest characteristic frequency of
the original membrane; continuations of characteristic functions, however,
do generally depend on the path of deformation (i.e., by changing the path,
the sign of the resulting characteristic function can be changed).

In particular, if by starting from a membrane with a simple spectrum
and deforming it we describe a closed path in the space of membranes and
return to the original membrane, bypassing the set of membranes with
multiple spectra (which has codimension 2), then the k-th characteristic
frequency returns to its original value, while the k-th characteristic function
may change sign. [Editor's note: Conclusions like this have been
proven by K. Uhlenbeck (Amer. J. Math. 98 (1976), 1059-1078).]
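The change of sign has a finite-dimensional model that can be followed step by step. The sketch below is a hypothetical illustration with quadratic forms on the plane in place of membranes: the loop A = cos t, B = sin t, C = -cos t encircles the degenerate form A = C, B = 0, the eigenvalues stay equal to +1 and -1 along it, and the eigenvector of +1 is continued by always choosing the sign closest to the previous step. The function names are illustrative.

```python
import math


def top_eigenvector(A, B, C):
    """Unit eigenvector of [[A, B], [B, C]] for the larger eigenvalue."""
    phi = 0.5 * math.atan2(2 * B, A - C)
    return (math.cos(phi), math.sin(phi))


def continue_around_loop(steps=360):
    """Follow the eigenvector continuously once around the loop
    A = cos(t), B = sin(t), C = -cos(t), 0 <= t <= 2*pi."""
    start = top_eigenvector(1.0, 0.0, -1.0)
    current = start
    for k in range(1, steps + 1):
        t = 2 * math.pi * k / steps
        v = top_eigenvector(math.cos(t), math.sin(t), -math.cos(t))
        # An eigenvector is defined only up to sign; keep the sign that
        # is closest to the previous step, so the continuation is continuous.
        if v[0] * current[0] + v[1] * current[1] < 0:
            v = (-v[0], -v[1])
        current = v
    return start, current
```

After one full circuit the form returns to where it started, and so do its eigenvalues, but the continued eigenvector comes back as the negative of the initial one.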

C The effect of symmetries on the multiplicity of the spectrum

A multiple spectrum is the exception in systems of general form, but it
is not removable under small perturbations in cases when the given system
is symmetric and the deformations preserve the symmetry.

Consider,  for  example,  a  system  of  three  identical  masses  at  the  vertices 
of  an  equilateral  triangle,  connected  to  one  another  and  to  the  center  of  the 
triangle  by  identical  springs,  and  capable  of  moving  in  the  plane  of  the 
triangle.  The  system  has  rotational  symmetry  of  order  3.  Therefore,  there 
is  a  linear  operator  g  acting  on  the  configuration  space  (which  has  dimension 
6),  whose  third  power  is  equal  to  1  and  which  leaves  invariant  both  the 
euclidean structure of the configuration space and the ellipsoid in the
configuration space giving the potential energy.
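The operator g of this example can be written down explicitly. The sketch below makes several assumptions that are not in the text: unit masses, unit spring stiffness, the triangle inscribed in the unit circle, and the potential energy taken in its quadratic (small-oscillation) approximation, in which each spring contributes half the square of the relative displacement along its own direction. All names are illustrative.

```python
import math
import random

ANGLE = 2 * math.pi / 3
# Equilibrium positions: vertices of an equilateral triangle centered at 0.
VERTICES = [(math.cos(k * ANGLE), math.sin(k * ANGLE)) for k in range(3)]


def rotate(v):
    """Rotate a plane vector by 120 degrees."""
    c, s = math.cos(ANGLE), math.sin(ANGLE)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def g(u):
    """Symmetry operator on displacements u = [u_0, u_1, u_2]: the displacement
    of mass k - 1, rotated by 120 degrees, is assigned to mass k."""
    return [rotate(u[(k - 1) % 3]) for k in range(3)]


def unit(v):
    r = math.hypot(v[0], v[1])
    return (v[0] / r, v[1] / r)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])


def kinetic_norm(u):
    """Squared euclidean length of u in the six-dimensional configuration space."""
    return sum(dot(v, v) for v in u)


def potential(u, stiffness=1.0):
    """Quadratic potential energy of the three side springs and three radial springs."""
    total = 0.0
    for k in range(3):
        j = (k + 1) % 3
        side = unit(sub(VERTICES[k], VERTICES[j]))    # spring along a side
        total += 0.5 * stiffness * dot(sub(u[k], u[j]), side) ** 2
        radial = unit(VERTICES[k])                    # spring to the center
        total += 0.5 * stiffness * dot(u[k], radial) ** 2
    return total


def random_displacement(seed=0):
    """A reproducible random point of the configuration space."""
    rng = random.Random(seed)
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
```

For any displacement u, applying `g` three times returns u, and both `kinetic_norm` and `potential` take the same value on u and on `g(u)`; these are the three properties of g used in the argument.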

It  follows  that  this  ellipsoid  must  be  an  ellipsoid  of  revolution.  If  we  let 
g  be  the  indicated  operator  on  the  configuration  space  and  £  a  vector  on  the 
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major  axis  of  the  ellipsoid,  then  the  axis  in  the  direction  g£  is  also  a  major 
axis  (since  the  rotation  g  takes  the  ellipsoid  to  itself). 

There are two possibilities for the vector $g\xi$: either $g\xi = \xi$, or the vectors
$\xi$ and $g\xi$ are linearly independent. In the second case, the plane spanned by
the vectors $\xi$ and $g\xi$ consists entirely of major axes. Therefore, the eigenvalues
corresponding to these axes are at least double. The space spanned by the
three vectors $\xi$, $g\xi$, and $g^2\xi$ is invariant under g. It is either two dimensional
(in which case g acts by a 120° rotation) or three dimensional (in which case
g acts by the same rotation around $\xi + g\xi + g^2\xi$ as an axis). In the latter case,
we may choose the direction of this sum for one of the principal axes of the
ellipsoid, with the two other principal axes in the three-dimensional space
perpendicular to it. It is therefore possible to choose the principal axes for an
ellipsoid which is invariant under an orthogonal transformation of order three
(in a space of any number of variables), so that each axis is either fixed under
the transformation or is rotated by 120° in an invariant plane spanned by it
and another axis (orthogonal to it, as well as to all other axes) of the same
length. In what follows, we shall assume that the axes of ellipsoids and the
directions of the corresponding characteristic oscillations have been chosen
in the manner just described.

Our argument shows that characteristic oscillations of a system with
third-order rotational symmetry can be of two types: those invariant under
rotation by 120° ($g\xi = \xi$) and those passing under such a rotation to independent
characteristic oscillations with the same frequency ($g\xi$ and $\xi$ independent).
In the second case, there actually arise three forms of characteristic
oscillations with the same frequency ($\xi$, $g\xi$ and $g^2\xi$), but only two of them are
independent:

$$\xi + g\xi + g^2\xi = 0,$$

since the sum of three vectors of equal length on the plane forming angles of
120° is equal to zero.
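
The last identity is easy to check numerically. The following is only an illustrative sketch: it represents a plane vector by a complex number and the rotation g by multiplication by $e^{2\pi i/3}$; the names `rotate_120` and `orbit_sum` are ours and do not come from the text.

```python
import cmath
import math


def rotate_120(z: complex) -> complex:
    """Rotate a plane vector (stored as a complex number) by 120 degrees."""
    return z * cmath.exp(2j * math.pi / 3)


def orbit_sum(z: complex) -> complex:
    """Return xi + g xi + g^2 xi for the plane vector xi = z."""
    gz = rotate_120(z)
    ggz = rotate_120(gz)
    return z + gz + ggz


xi = complex(0.7, -1.9)        # an arbitrary sample vector
print(abs(orbit_sum(xi)))      # of the order of rounding error
```

The same computation works for any nonzero plane vector, which is why only two of the three oscillations $\xi$, $g\xi$, $g^2\xi$ can be independent.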

The number of characteristic oscillations of our system is generally equal
to 6. To find out how many of them are of the first (symmetric) and second
(nonsymmetric) type, we can use the following argument. Consider the
limiting case, when each of the masses oscillates independently from the
others. In this case, we can choose an orthonormal basis of the configuration
space consisting of six characteristic oscillations, two for each point, for
which that point moves and the other two do not. We denote by $\xi_i$ and $\eta_i$
the characteristic vectors corresponding to the i-th point with characteristic
frequencies a and b, respectively, and let $x_i$, $y_i$ be coordinates in the
orthonormal basis $\xi_i$, $\eta_i$. Then the potential energy can be written in the
form

$$U = \tfrac{1}{2}(a^2x_1^2 + b^2y_1^2) + \tfrac{1}{2}(a^2x_2^2 + b^2y_2^2) + \tfrac{1}{2}(a^2x_3^2 + b^2y_3^2).$$

The symmetry operator g permutes the coordinate axes:

$$g\xi_1 = \xi_2, \quad g\xi_2 = \xi_3, \quad g\xi_3 = \xi_1;$$

$$g\eta_1 = \eta_2, \quad g\eta_2 = \eta_3, \quad g\eta_3 = \eta_1.$$

We can now represent our six-dimensional space as the orthogonal direct
sum of two straight lines and two two-dimensional planes, invariant under
the symmetry operator g. That is, the invariant lines are defined by the
directions of the vectors

$$\xi_1 + \xi_2 + \xi_3 \quad \text{and} \quad \eta_1 + \eta_2 + \eta_3,$$

and the invariant planes are their orthogonal complements in the spaces
spanned by the vectors $\xi_i$ and $\eta_i$, respectively. The first straight line is the
direction of a symmetric characteristic oscillation with frequency a, and the
second the direction of one with frequency b. In exactly the same way, every
vector in the first plane is a direction of characteristic oscillation with frequency
a which, under rotation by 120°, goes to an independent oscillation
of the same frequency; for all vectors in the second plane, the oscillation is
also not symmetric, with frequency b.

Thus, in this degenerate case of three independent points, there are two
independent characteristic oscillations of symmetric type, and four unsymmetric,
of which the latter are divided into two pairs. In each pair the
oscillations have the same eigenvalue and are obtained from one another by
rotation of the plane of our points by 120°.

We now claim that the conclusion above holds true for any law of interaction
between our points if the interaction is symmetric, i.e., if the potential
energy of the system is preserved under rotation of the plane by 120°.

In fact, decompose the 6-dimensional configuration space into an orthogonal
sum of the plane of invariant vectors of g and of its orthogonal complement.
The potential energy will decompose into a sum of two quadratic
forms, one in two variables, the other in four. Now consider characteristic
oscillations in the two-dimensional and four-dimensional configuration
spaces, with potential energy described above. The four-dimensional space
decomposes into two g-invariant planes, orthogonal in the potential energy
metric. We have obtained a system of six characteristic oscillations having
the required properties.
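
The dimensions 2 and 4 in this decomposition can be checked directly. The sketch below is an illustration under our own conventions, not part of the original argument: it writes g as a 6×6 matrix in the coordinates $(x_1, y_1, x_2, y_2, x_3, y_3)$ of the three displacements (g rotates each displacement by 120° and passes it to the next point), and computes the trace of the averaging operator $(1 + g + g^2)/3$. Since g is orthogonal of order 3, this operator is the orthogonal projector onto the invariant vectors, so its trace is the dimension of their space.

```python
import math


def symmetry_operator():
    """6x6 matrix of g in the coordinates (x1, y1, x2, y2, x3, y3)."""
    c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
    rot = [[c, -s], [s, c]]                  # rotation of the plane by 120 degrees
    g = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        j = (i + 1) % 3                      # displacement of point i goes to point i + 1
        for a in range(2):
            for b in range(2):
                g[2 * j + a][2 * i + b] = rot[a][b]
    return g


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def invariant_dimension():
    """Dimension of the space of g-invariant vectors = trace of (1 + g + g^2)/3."""
    g = symmetry_operator()
    g2 = matmul(g, g)
    trace = sum(1.0 + g[i][i] + g2[i][i] for i in range(6)) / 3.0
    return round(trace)


print(invariant_dimension())                 # dimension of the invariant plane
print(6 - invariant_dimension())             # dimension of its orthogonal complement
```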

Thus, in a system in general form of three points in the plane with rotational
symmetry of order 3, there are four different characteristic frequencies, two
of which are simple and two double. Each of the simple characteristic frequencies
corresponds to a symmetric characteristic oscillation, and each of
the double ones to three characteristic oscillations obtained from one another
by rotation by 120° and summing to zero (so that only two of them are
independent).

Problem. Classify the characteristic oscillations of a system with the symmetries of an equilateral
triangle (allowing not only rotation by 120°, but also reflection through the altitude of the
triangle).

Problem.  Classify  the  characteristic  oscillations  of  a  system  whose  group  of  symmetries  is  the 
group  of  24  rotations  of  the  cube. 

Answer. The oscillations will be of five types. By rotations, from each oscillation one can obtain
systems of 8, or 6, or 4, or 2, or 1 independent oscillations (in the last case the oscillations are
entirely symmetric).

Remark. To classify oscillations in systems with any group of symmetries, a special apparatus
has been developed (the so-called theory of group representations). Cf., for example, Michael
Tinkham, Group Theory and Quantum Mechanics, McGraw-Hill, 1964.


D  The  behavior  of  frequencies  of  a  symmetric  system  under  a 
variation  of  parameters  preserving  the  symmetry 
We  assume  now  that  our  symmetric  system  depends  in  a  general  way  on  some 
number  of  parameters,  and  that  the  symmetry  is  not  disturbed  when  the 
parameters are varied. Then the characteristic frequencies of various multiplicities
will also depend on the parameters, and the question arises of when
the characteristic frequencies will collide. We will confine ourselves to
formulating a result for the simplest case of systems with third-order rotational
symmetry (for rotational symmetry of any order n > 3, the answer is
the same). The details can be found in the following articles: V. I. Arnold,
Modes and quasi-modes, Functional Analysis and Its Applications, 6:2
(1972), 94-101; V. N. Karpushkin, The asymptotic behavior of the eigenvalues
of symmetric manifolds and the “most probable” representations of
finite  groups,  Moscow  Univ.  Math.  Bull.  29  (1974),  no.  2,  136-139. 

Characteristic  oscillations  of  any  system  with  rotational  symmetry  of 
order  3  are  divided  into  two  types:  symmetric  oscillations,  and  oscillations 
carried  by  rotation  by  120°  into  independent  ones.  For  a  general  system  with 
third-order  rotational  symmetry  (without,  in  particular,  any  additional 
symmetry)  all  the  characteristic  frequencies  of  the  first  type  are  simple,  and 
of  the  second,  double.  In  addition,  it  turns  out  that  if  a  system  depends  in  a 
general way on one parameter and is symmetric for all values of the parameter,
then under variation of the parameter, the characteristic frequencies of
symmetric oscillations do not collide with one another, and the double
characteristic frequencies of asymmetric oscillations do not split. In addition,
the double characteristic frequencies of asymmetric oscillations do not
collide with one another under a change of parameters. However, the characteristic
frequencies of symmetric and asymmetric oscillations move under
changes of parameter independently from one another, so that for discrete
values of the parameter the characteristic frequency of a symmetric oscillation
and the (double) characteristic frequency of an asymmetric oscillation
can collide (and pass through one another).

In order to make two characteristic frequencies of symmetric oscillations
collide, we must vary at least two parameters; and to make two characteristic
frequencies of asymmetric oscillations collide we must vary at least three.

In general, in the typical family of systems with third-order rotational
symmetry, for the collision of i simple characteristic frequencies (i symmetric
oscillations) and j double frequencies (j unsymmetric oscillations) to occur,
the number of parameters of the family must be at least

$$\frac{(i - 1)(i + 2)}{2} + j^2.$$
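
As a small consistency check, the count $(i - 1)(i + 2)/2 + j^2$ can be evaluated for the cases already mentioned in the text: two simple frequencies (two parameters), two double frequencies (three parameters), and one simple with one double frequency (one parameter). The function name `min_parameters` below is ours, introduced only for this illustration.

```python
def min_parameters(i: int, j: int) -> int:
    """Least number of parameters needed for i simple and j double
    characteristic frequencies to collide: (i - 1)(i + 2)/2 + j^2."""
    if i < 0 or j < 0 or i + j == 0:
        raise ValueError("need non-negative i, j with i + j >= 1")
    # (i - 1)(i + 2) = i(i + 1) - 2 is always even, so the division is exact
    return (i - 1) * (i + 2) // 2 + j * j


for i, j in [(2, 0), (0, 2), (1, 1), (1, 0), (0, 1)]:
    print(i, j, min_parameters(i, j))
```

The values for (1, 0) and (0, 1) are zero, as they should be: a single simple or a single double frequency involves no collision at all.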

We  apply  this  to  oscillations  of  symmetric  membranes.  Here  we  will 
assume  that  the  membrane  is  of  general  form,  admits  rotation  by  120°,  and 
corresponds  to  an  ellipsoid  of  general  form  in  the  space  of  ellipsoids  of  the 
configuration  space  admitting  the  transformation  of  the  configuration  space 
induced  by  the  rotation  of  the  membrane. 


The  exact  formulation  of  this  assumption  is  that,  for  all  membranes  except  a  set  of  infinite 
codimension,  the  mapping  from  the  space  of  symmetric  membranes  into  the  space  of  symmetric 
ellipsoids  is  transverse  to  each  of  the  manifolds  of  ellipsoids  with  a  given  number  of  multiple 
axes. 


If we agree to this assumption, we come to the following conclusions about
oscillations of symmetric membranes.

1. For membranes of general form admitting rotation by 120°, asymptotically
one-third of the characteristic frequencies (counting them with
multiplicities) are simple, and the corresponding characteristic oscillations
admit rotation by 120°. The remaining characteristic frequencies are
double; each double characteristic frequency corresponds to three eigenfunctions
whose sum is zero and which are taken to one another under
rotation by 120°.

2. In general one-parameter families of such symmetric membranes,
for isolated values of the parameters there are collisions of a single frequency
with a double frequency, but there are no collisions of single
frequencies with one another or collisions of double frequencies with one
another.

3. The minimal number of parameters of a family of membranes for which
more complicated collisions of characteristic frequencies are realized
(stably with respect to small perturbations preserving the symmetry) is
given by the formula

$$\sum_{i,j}\left[\frac{(i - 1)(i + 2)}{2} + j^2\right]\nu_{ij},$$

where $\nu_{ij}$ is the number of points of collision of i single and j double
frequencies.

In particular, for a typical small deformation of a circular membrane
preserving rotational symmetry of order 3, a third of the eigenvalues
(corresponding to eigenfunctions with azimuthal part $\cos 3k\varphi$ and
$\sin 3k\varphi$) immediately disperse. Under further one-parameter deformation
the simple and double characteristic frequencies can pass through one
another, but two simple or two double frequencies cannot collide with one
another.


E  Discussion 

The  value  of  the  concepts  of  general  position  and  symmetry  lies,  in  particular, 
in  the  fact  that  they  allow  us  to  obtain  some  information  in  those  cases 
where  we  cannot  find  an  exact  solution  of  a  problem.  In  particular,  for 
almost  no  membranes  do  we  know  the  forms  of  the  characteristic  oscillations. 
Nevertheless,  from  general  arguments  we  can  say  something,  for  example, 
about  the  multiplicities  of  eigenvalues. 

The study of high-frequency oscillations of continuous media is very
important in many fields (optics, acoustics, etc.), and special methods have
been developed for approximate determination of the form of characteristic
oscillations. One of these methods (called the method of quasi-classical
asymptotics) consists of seeking an oscillation which is locally close to a
simple harmonic wave of short length, but which changes its amplitude and
the direction of its front from point to point.

Analysis  (which  we  will  not  go  into  here)  shows  that  in  some  cases  we 
can  construct  approximate  solutions,  with  the  indicated  properties,  of  the 
equation  for  eigenfunctions.  They  are  approximate  solutions  in  the  sense 
that  they  almost  satisfy  the  equation  for  eigenfunctions  (not  in  the  sense  that 
they  are  close  to  real  eigenfunctions). 

In particular, if the membrane has the form of an equilateral triangle with
smoothed and strongly blunted corners, then we can construct an approximate
solution of the type described which differs appreciably from zero only
in a neighborhood of one of the altitudes of the triangle. (Physicists call this
approximate solution the wave analogue of a beam moving along the altitude
of the triangle; this beam is a stable¹¹⁵ trajectory on a billiard table having
the shape of our membrane; cf. the following appendix on short wave
asymptotics).

It follows from symmetry and general position arguments that typical
membranes with rotational symmetry of third order have no real characteristic
oscillations of the type described. Assume that one of the characteristic
oscillations of the membrane is concentrated near an altitude (but not near
the center of the membrane). Then, rotating it by 120° and 240° we obtain
three characteristic oscillations with the same characteristic frequency. These
three oscillations are independent (this follows from the fact that their sum
is not zero). Therefore, the characteristic frequency has multiplicity 3, which
does not occur in typical systems with third-order rotational symmetry.

¹¹⁵ The condition for linear stability of a billiard trajectory has the form

$$(r_1 + r_2 - l)(r_1 - l)(r_2 - l) > 0,$$

where l is the length of the interval of the trajectory and $r_1$ and $r_2$ are the radii of curvature of
the walls at its ends.

From  this  argument  it  is  clear  that  attempting  to  construct  rigorous  high- 
frequency  asymptotics  for  eigenfunctions  is  a  rather  hopeless  task;  what  we 
can  hope  to  do  is  to  obtain  approximate  formulas  for  almost  characteristic 
oscillations.  Such  an  almost  characteristic  oscillation  can  differ  very  strongly 
from  real  characteristic  oscillations,  but  if  we  give  the  membrane  the  initial 
condition  corresponding  to  it,  then  for  a  long  time  the  oscillation  will  resemble 
a  standing  wave  (characteristic  oscillation). 

An  example  of  an  almost  characteristic  oscillation  is  the  motion  of  one 
of  two  identical  pendulums  connected  by  a  very  weak  spring.  If,  at  the  initial 
moment,  we  set  the  first  pendulum  in  motion  and  leave  the  second  fixed,  then 
for  a  long  time  it  will  appear  that  only  the  first  pendulum  is  oscillating,  and 
the oscillation will be almost characteristic. For true characteristic oscillations,
both pendulums oscillate with the same amplitude.
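
The pendulum example can be made quantitative in the linearized model. The sketch below assumes (our choice, not the text's) unit masses and lengths, a common frequency `omega0`, and a spring of small stiffness `kappa`, so that the equations are $\ddot x_1 = -\omega_0^2 x_1 - \kappa(x_1 - x_2)$ and $\ddot x_2 = -\omega_0^2 x_2 - \kappa(x_2 - x_1)$. The true characteristic oscillations are then the in-phase and anti-phase motions, and the motion started from $x_1 = 1$, $x_2 = 0$ at rest is their half-sum.

```python
import math


def mode_frequencies(omega0: float, kappa: float):
    """Frequencies of the in-phase and anti-phase characteristic oscillations."""
    return omega0, math.sqrt(omega0 ** 2 + 2.0 * kappa)


def displacements(t: float, omega0: float, kappa: float):
    """Motion started from x1 = 1, x2 = 0 with zero velocities."""
    ws, wa = mode_frequencies(omega0, kappa)
    x1 = 0.5 * (math.cos(ws * t) + math.cos(wa * t))
    x2 = 0.5 * (math.cos(ws * t) - math.cos(wa * t))
    return x1, x2


def transfer_time(omega0: float, kappa: float) -> float:
    """Time after which the energy has passed entirely to the second pendulum."""
    ws, wa = mode_frequencies(omega0, kappa)
    return math.pi / (wa - ws)


omega0, kappa = 1.0, 0.001
print(transfer_time(omega0, kappa))                         # long compared with the period 2*pi
print(max(abs(displacements(t, omega0, kappa)[1]) for t in range(21)))  # second pendulum barely moves
```

For a weak spring the transfer time is of order $\pi\omega_0/\kappa$, which is what "for a long time" means here.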

The problem of connecting the geometry of a membrane with the properties of its characteristic
oscillations has been intensively studied in recent years by many authors (including H. Weyl,
S. Minakshisundaram and A. Pleijel, A. Selberg, J. Milnor, M. Kac, I. Singer, H. McKean,
M. Berger, Y. Colin de Verdière, J. Chazarain, J. J. Duistermaat, V. F. Lazutkin, A. I. Schnirelman,
and S. A. Molchanov).

To the simplest question, “Can you hear the shape of a drum?” the answer turns out to be
negative: there exist non-isometric riemannian manifolds with the same spectrum. On the
other hand, several properties of a manifold can be recovered from the eigenvalues of the laplacian
and from the properties of eigenfunctions (for example, the complete set of lengths of closed
geodesics can be recovered).

From the point of view of physical optics, the description of the propagation
of  light  in  geometric  optics,  using  rays  (i.e.,  Hamilton’s  canonical  equations) 
or  wave  fronts  (i.e.,  the  Hamilton-Jacobi  equation),  is  only  an  approximation. 
According  to  the  ideas  of  physical  optics,  light  is  electromagnetic  waves, 
and  geometric  optics  is  a  first  approximation,  a  good  description  of 
phenomena  only  when  the  length  of  the  waves  is  small  compared  to  the  size 
of  the  objects  being  considered. 

A mathematical version of these physical ideas consists of asymptotic
formulas for solving the corresponding differential equations, formulas
which give better approximations for higher-frequency oscillations (i.e., for
shorter waves). These asymptotic formulas can be written in terms of rays
(i.e., motions in some hamiltonian dynamical system) or fronts (i.e., solutions
of the Hamilton-Jacobi equation).

Similar short wave asymptotics exist for solutions of many equations in
mathematical physics, describing all wave processes. In different areas of
physics and mathematics they are connected with different names. For
example, in quantum mechanics, short wave asymptotics are called quasi-classical
approximations; they are determined by the so-called WKBJ method
(Wentzel, Kramers, Brillouin, Jeffreys), although these approximations were
used much earlier by Liouville, Green, Stokes, Rayleigh and others.

The construction of short wave asymptotics is based on the idea that,
locally, a series of almost strictly sinusoidal waves is observed at each place,
although the amplitudes of these waves and the directions of their fronts
change slowly from point to point. Formal substitution of a function of this
form into the partial differential equations describing the wave process
leads us (in a first approximation for waves of small length) to the
Hamilton-Jacobi equation for wave fronts. The higher-order approximations
allow us to determine as well the dependence of the amplitude of oscillation
on the point.

Of  course,  the  entire  procedure  requires  a  mathematical  foundation.  The 
exact  formulation  and  proof  of  the  corresponding  theorems  are  not  at  all  easy. 
Particular  difficulty  is  introduced  by  “caustics”  (i.e.,  focal  or  conjugate 
points,  or  turning  points). 

Caustics  are  envelopes  of  families  of  rays;  they  can  be  seen  on  a  wall 
illuminated  by  rays  reflected  from  some  smooth  curved  surface.  If  the  rays 
orthogonal  to  the  wave  fronts  intersect  and  form  caustics,  then  near  the 
caustics  the  formulas  for  short  wave  asymptotics  must  be  slightly  changed. 
Namely, the phase of oscillations along each ray undergoes a standard discontinuity
(one-fourth of a wave) upon each passage of the ray through a
caustic.

A precise description of all these phenomena may be conveniently developed
in terms of the geometry of lagrangian submanifolds of the corresponding
phase space and their projections onto the configuration space. Here,
caustics are interpreted as singularities of the projection, from phase space
to configuration space, of that lagrangian manifold which represents a
family of rays. Thus, the normal forms of singularities of lagrangian projections
introduced in Appendix 12 supply a classification of singularities of
caustics formed by systems of rays in “general position.”

In this appendix we introduce (without proof) the simplest formulas of
short wave asymptotics for the Schrödinger equation of quantum mechanics.
A more detailed exposition can be found in the following places:

J. Heading, Introduction to phase integral methods, Methuen Co. Ltd., 1962. (Cf. especially
Appendix II (by V. P. Maslov) in the Russian translation of Heading’s book, Moscow 1965).

V. P. Maslov, Théorie des perturbations et méthodes asymptotiques, Paris, Dunod, 1972 (Russian
edition: Moscow University, 1965).

V. I. Arnold, On a characteristic class entering into conditions of quantization, Functional Analysis
and its Applications, v. 1 (1967).

L. Hörmander, Fourier integral operators, Acta Math. 127 (1971), 79-183.

A Quasi-classical approximation for solutions
of Schrödinger’s equation

Schrödinger’s equation for a particle in a field with potential energy U in
euclidean space is an equation for a complex-valued function $\psi(q, t)$:

$$ih\,\frac{\partial \psi}{\partial t} = -\frac{h^2}{2}\,\Delta\psi + U(q)\psi, \qquad q \in \mathbb{R}^n,\ t \in \mathbb{R}.$$


Here, h is some real constant which is also a small parameter of the problem
being considered, and $\Delta$ is the Laplace operator.

We assume that the initial condition has the short wave form

$$\psi|_{t=0} = \varphi(q)\,e^{(i/h)\,s(q)},$$

where the smooth function $\varphi$ is nonzero only inside some bounded region.
We will find below an asymptotic (as h → 0) formula for the solution of
Schrödinger’s equation with such an initial condition.

First of all, we consider the motion of a classical particle in the field with
potential energy U, i.e., we consider Hamilton’s equations

$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}, \qquad \text{where } H = \tfrac{1}{2}p^2 + U(q),$$

in 2n-dimensional phase space. The solutions of these equations determine
a phase flow (under some conditions on the potential, which we assume fulfilled;
these conditions prevent the particle from going off to infinity in a
finite time).

We associate to our short wave initial condition a lagrangian submanifold
of the phase space (i.e., a manifold whose dimension is equal to the dimension
of the configuration space and on which the 2-form $dp \wedge dq$ defining the symplectic
structure on the phase space is identically zero). Namely, we define
the “momentum” corresponding to our initial condition as the gradient of
the phase, i.e., we set

$$p(q) = \frac{\partial s}{\partial q}.$$

Lemma. For any smooth function s, the graph of the function p(q) constructed
by it in the phase space $\mathbb{R}^{2n} = \{(p, q)\}$ is a lagrangian manifold. Conversely,
if a lagrangian manifold projects diffeomorphically onto the q-space (i.e., it
is a graph), then it is given by some generating function s, according to the
formula above.
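
For two degrees of freedom the lemma amounts to a single condition: the restriction of $dp \wedge dq = dp_1 \wedge dq_1 + dp_2 \wedge dq_2$ to the graph of p(q) equals $(\partial p_1/\partial q_2 - \partial p_2/\partial q_1)\, dq_2 \wedge dq_1$, and this coefficient vanishes exactly when p is a gradient. The sketch below checks this numerically by central differences for one sample phase s (chosen by us purely for illustration) and, for contrast, for a momentum field that is not a gradient.

```python
import math


def restricted_form(p, q, eps=1e-5):
    """Coefficient of dq2 ^ dq1 in the restriction of dp ^ dq to the graph of p,
    i.e. dp1/dq2 - dp2/dq1, estimated by central differences at the point q."""
    q1, q2 = q
    dp1_dq2 = (p((q1, q2 + eps))[0] - p((q1, q2 - eps))[0]) / (2 * eps)
    dp2_dq1 = (p((q1 + eps, q2))[1] - p((q1 - eps, q2))[1]) / (2 * eps)
    return dp1_dq2 - dp2_dq1


def gradient_momentum(q):
    """p = ds/dq for the sample phase s(q) = sin(q1) q2^2 + q1^3 q2."""
    q1, q2 = q
    return (math.cos(q1) * q2 ** 2 + 3 * q1 ** 2 * q2,
            2 * math.sin(q1) * q2 + q1 ** 3)


def rotation_momentum(q):
    """A momentum field that is not a gradient: p = (-q2, q1)."""
    q1, q2 = q
    return (-q2, q1)


point = (0.4, -1.1)
print(restricted_form(gradient_momentum, point))   # close to 0: the graph is lagrangian
print(restricted_form(rotation_momentum, point))   # close to -2: the graph is not lagrangian
```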


We denote the lagrangian manifold constructed from the initial condition
(with the function s) by M. After time t the phase flow $g^t$ carries the manifold
M to another manifold $g^t M$. This new manifold is also lagrangian, since the
phase flow preserves the symplectic structure.

For small t, the new lagrangian manifold, like the old, projects diffeomorphically
onto the configuration space. However, for large t this is not
necessarily true (Figure 244).


Figure 244 Transformation of lagrangian manifolds by the phase flow

In  other  words,  several  points  of  the  new  lagrangian  manifold  may  project 
to  one  point  Q  of  the  configuration  space.  We  assume  that  there  are  only 
finitely  many  of  these  points  and  that  they  are  all  nondegenerate  (i.e.,  that  at 
each  of  the  points  of  the  new  lagrangian  manifold  which  project  onto  Q,  the 
derivative of the projection mapping onto the configuration space is nondegenerate).

The  nondegeneracy  condition  is  satisfied  for  almost  all  points  Q.  Those  exceptional  points  Q 
for  which  it  is  not  satisfied  form  a  set  of  measure  zero  in  the  configuration  space.  In  the  general 
case,  this  set  is  a  surface  whose  dimension  is  one  less  than  the  dimension  of  the  configuration 
space.  This  surface,  playing  the  role  of  a  caustic  in  our  problem,  can  itself  have  complicated 
singularities.

The points of the new lagrangian manifold projecting to the point Q arose
under the phase flow transformation from several points of the original
lagrangian manifold (constructed from the initial condition). In other words,
after time t, several trajectories of classical particles, with initial conditions
belonging to the original lagrangian manifold, arrive at Q.
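
A one-dimensional illustration may help here. The sketch below assumes, purely for the sake of example, a free particle (U = 0) and the initial phase $s(q) = -\cos q$, so that $p(q) = \sin q$ and the particle starting at q arrives at $Q = q + t \sin q$. The derivative $dQ/dq = 1 + t\cos q$ stays positive for t < 1, so the manifold remains a graph; for t > 1 the projection folds and several trajectories arrive at the same Q. The function names are ours.

```python
import math


def endpoint(q: float, t: float) -> float:
    """Position at time t of the free particle starting at q with momentum sin q."""
    return q + t * math.sin(q)


def count_rays(Q: float, t: float, samples: int = 4000) -> int:
    """Count the initial points q whose trajectories arrive at Q at time t > 0,
    by counting sign changes of endpoint(q, t) - Q on a uniform grid."""
    lo, hi = Q - t, Q + t          # |p| <= 1, so the starting point lies within t of Q
    count = 0
    prev = endpoint(lo, t) - Q
    for k in range(1, samples + 1):
        q = lo + (hi - lo) * k / samples
        cur = endpoint(q, t) - Q
        if prev * cur < 0.0:       # a root lies between two neighbouring grid points
            count += 1
        prev = cur
    return count


print(count_rays(3.0, 0.5))        # before the first fold: a single trajectory
print(count_rays(3.0, 2.0))        # after the fold: several trajectories
```

The grid count is only a heuristic: it can miss roots that fall exactly on a grid point or lie closer together than the grid step, which is what happens near a caustic.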

We let $(p_j, q_j)$ denote these initial points in the phase space, and $S_j$ the
action along the trajectories of the phase flow coming from the point $(p_j, q_j)$.
More precisely, we set

$$S_j(Q, t) = s(q_j) + \int_0^t L\, d\theta,$$

where $L = \dot q^2/2 - U(q)$ and $g^\theta(p_j, q_j) = (p(\theta), q(\theta))$.


Then, as h → 0, the solution of Schrödinger’s equation with the oscillating
initial condition given by the functions s and $\varphi$ has asymptotic form

$$\psi(Q, t) = \sum_j \varphi(q_j)\left|\frac{DQ}{Dq_j}\right|^{-1/2} e^{(i/h)S_j(Q,t) - (i\pi/2)\mu_j} + O(h),$$

where $\mu_j$ is an integer (the Morse index) which will be defined below.
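
To see how the ingredients of one term of this sum fit together, consider the simplest situation, which we assume here only for illustration: a free particle (U = 0) in one dimension. Then $Q = q_j + t\,s'(q_j)$, the Jacobian is $DQ/Dq_j = 1 + t\,s''(q_j)$, the action is $S_j = s(q_j) + p_j^2 t/2$, and the trajectory meets a focal point at $\theta = -1/s''(q_j)$ when this moment lies in (0, t). The helper `wkb_term` below is a hypothetical name; it evaluates a single term, taking $\varphi(q_j)$, $s(q_j)$, $p_j = s'(q_j)$ and $s''(q_j)$ as inputs.

```python
import cmath
import math


def wkb_term(phi_qj: float, s_qj: float, p_j: float, d2s_qj: float,
             t: float, h: float) -> complex:
    """One term phi |DQ/Dq|^(-1/2) exp(i S/h - i pi mu/2) for a free particle in 1D."""
    jacobian = 1.0 + t * d2s_qj                 # DQ/Dq_j for Q = q + t s'(q)
    action = s_qj + 0.5 * p_j ** 2 * t          # S_j = s(q_j) + integral of L = p^2/2
    # Morse index: one focal point at theta = -1/s'' if it occurs before time t
    mu = 1 if (d2s_qj < 0.0 and -1.0 / d2s_qj < t) else 0
    phase = action / h - 0.5 * math.pi * mu
    # raises ZeroDivisionError exactly on the caustic (jacobian == 0)
    return phi_qj * abs(jacobian) ** -0.5 * cmath.exp(1j * phase)


# Plane wave s(q) = p0 q with phi = 1: the single term reproduces the exact solution
p0, h, t, Q = 1.3, 0.05, 0.7, 0.4
q_j = Q - p0 * t
approx = wkb_term(1.0, p0 * q_j, p0, 0.0, t, h)
exact = cmath.exp(1j * (p0 * Q - 0.5 * p0 ** 2 * t) / h)
print(abs(approx - exact))
```

For a focusing phase such as $s(q) = -q^2$ (so $s'' = -2$), the same function returns a term whose phase is shifted by a quarter wave once t exceeds 1/2, in agreement with the rule for passing through a caustic.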

In  order  to  explain  this  formula,  we  first  consider  the  case  when  the  time 
interval  t  is  small.  In  this  case,  the  sum  is  reduced  to  a  single  term,  since  the 
lagrangian  manifold  obtained  from  the  original  lagrangian  manifold  by  the 
phase  flow  transformation  after  small  time  projects  diffeomorphically  onto 
the configuration space. In other words, of the family of particles corresponding
to the initial condition for Schrödinger’s equation, only one arrives at Q
after the small time t.

For small t, the Morse index is equal to zero (as we will see below from its
definition). In this way the function $\psi(Q, t)$ has, like the initial condition, a
rapidly oscillating form. Thus, the function S defining the wave fronts at time
t is none other than the value at time t of the solution of the Hamilton-Jacobi
equation, the initial condition for which is given by the function s defining
the wave front at the initial moment. The amplitude of the wave at time t at
the point Q is obtained from the amplitudes, at the initial moment at the
original point, of the trajectories coming to Q multiplied by a certain factor.
This factor is chosen so that, under motions of the particles corresponding
to our initial conditions, the integral of the square of the modulus of the
function $\psi$, over a region of configuration space filled with particles, does not
change with time. (Here we assume that at the initial moment, some region in
the configuration space has been selected; then the phase points on the
original lagrangian manifold are selected whose projections onto the configuration
space lie in this region; their images under the action of the phase
flow after time t are found; finally, the projections of these images onto the
configuration space form the region “filled with particles at time t.”)

B The Morse and Maslov indices

The number $\mu_j$ is defined as the number of focal points to the manifold M
on the interval [0, t] of the phase curve starting out at the point $(p_j, q_j)$.

Focal points to the manifold M are defined as follows. We chose the point
Q so that, under projection of the lagrangian manifold obtained from M at
time t, a nondegeneracy condition is satisfied at this point. However, if we
consider the entire phase curve coming from the point $(p_j, q_j)$, then at some
moments of time $\theta$ between 0 and t, the nondegeneracy condition may not be
satisfied at the point $(p(\theta), q(\theta))$ of the lagrangian manifold $g^\theta M$. Such points
are called focal points to the manifold M along this phase curve.


We note that the definitions of focal points to M and the Morse index do not depend on
Schrödinger’s equation, but relate simply to the geometry of the phase flow in the cotangent
bundle to the configuration space (or to the calculus of variations, which is the same thing).

In particular, as our lagrangian manifold M we may take the fiber of the cotangent bundle
passing through the point $(p_0, q_0)$ (given by the condition $q = q_0$). In this case a focal point to
M on the phase curve going out from $(p_0, q_0)$ is called conjugate to the original point (more
precisely, the projection of this focal point onto the configuration space is said to be conjugate
to the point $q_0$ along the extremal in the configuration space starting at $q_0$ with momentum $p_0$).
In the even more special case of motion along a geodesic on a riemannian manifold, a focal
point to a fiber of the cotangent bundle is called conjugate to the initial point of the geodesic
along this geodesic. For example, the south pole of a sphere is conjugate to the north pole along
any meridian.

The  Morse  index  of  an  interval  of  a  geodesic,  equal  to  the  number  of  points  conjugate  to  the 
initial  point,  plays  an  important  role  in  the  calculus  of  variations.  Namely,  we  consider  the 
second differential of the action as a quadratic form on the space of variations (with fixed
endpoints) of the geodesic we are studying. Then the index of inertia of this quadratic form is equal
to the Morse index (cf., for instance, J. Milnor, Morse Theory, Princeton University Press, 1967).

Thus  the  geodesic,  up  to  the  first  conjugate  point,  is  a  minimum  of  the  action,  which  justifies 
the  name  “principle  of  least  action”  for  various  variational  principles  of  mechanics. 


We  note  that  in  calculating  the  Morse  index,  the  focal  points  must  be 
counted  with  multiplicity  (the  multiplicity  of  a  focal  point  in  general  position 
is  equal  to  1). 

The  Morse  index  is  a  particular  case  of  the  so-called  Maslov  index,  which 
is defined independently of the phase flow for any curve on a lagrangian
manifold of the cotangent bundle over the configuration space.

Consider  the  projection  of  our  n-dimensional  lagrangian  manifold  onto 
the n-dimensional configuration space. This is a smooth mapping of
manifolds of the same dimension. It can have singular points, i.e., points at which
the  rank  of  the  derivative  mapping  drops,  and  in  a  neighborhood  of  which 
the  projection  is  not  a  diffeomorphism. 

It turns out that in general the set of singular points has dimension $n - 1$
and consists of the union of a smooth manifold of dimension $n - 1$ made up of
simple singular points at which the rank drops by 1, and a finite set of
manifolds whose dimensions are $n - 3$ and smaller. Here, “in general” means that
these properties can be attained by an arbitrarily small perturbation of the
lagrangian manifold, under which it remains lagrangian.

We should point out that, among the pieces of various ranks into which the set of singular
points is divided, there is no piece of dimension $n - 2$. After the simplest singular points, forming
a manifold of dimension $n - 1$, there are the points where the rank drops by two; they form a
manifold of dimension $n - 3$. The projection of the set of singular points onto the configuration
space (the caustic) consists, in general, of pieces of all dimensions from 0 to $n - 1$ without
omissions.

Furthermore, it turns out that the $(n - 1)$-dimensional manifold of the
simplest  singular  points  is  two-sided  in  the  lagrangian  manifold;  that  is,  we 
can  coordinate  the  orientations  of  the  normals  at  all  points  in  the  following 
way. 

Consider some simple singular point on the lagrangian manifold. We take
a system of coordinates $q_1, \dots, q_n$ in a neighborhood of the projection of this
point onto the configuration space. Let $p_1, \dots, p_n$ be corresponding
coordinates in the fiber of the cotangent bundle. In a neighborhood of our singular
point, we can consider the lagrangian manifold as the graph of the vector
function $(q_1, p_2, \dots, p_n)$ of the variables $(p_1, q_2, \dots, q_n)$ (or a vector function
of an analogous form in which the role of the distinguished coordinate is
played not by the first coordinate but by any of the remaining coordinates).

Singular points near the given one are then defined by the condition
$\partial q_1/\partial p_1 = 0$. For lagrangian manifolds in general position, this derivative
changes  sign  upon  passing  from  one  side  of  the  manifold  of  singular  points  to 
the  other  in  our  neighborhood  of  the  simple  singular  point.  We  will  call  the 
side  where  this  derivative  is  positive  the  positive  side. 

We  note  that  it  is  necessary  to  prove  that  the  definitions  of  positive  direction  near  different 
points  agree  with  one  another.  Furthermore,  it  must  be  shown  that  the  positive  direction  near 
one  point  is  well  defined,  i.e.,  does  not  depend  on  the  coordinate  system.  All  this  can  be  done  by 
direct  calculations  (cf.  the  article  cited  above  in  “Functional  Analysis”).  For  further  development 
of  these  ideas,  see  V.  I.  Arnold,  Sturm  Theory  and  Symplectic  Geometry,  Funct.  Anal.  Appl. 
19  (1985). 

Now  the  Maslov  index  of  an  oriented  curve  on  a  lagrangian  manifold  is 
defined  as  the  number  of  passages  from  the  negative  side  of  the  manifold  of 
singularities  to  the  positive  side,  minus  the  number  of  passages  in  the  other 
direction.  In  this  we  assume  that  the  ends  of  the  curve  are  nonsingular  and 
that  the  curve  intersects  only  the  manifold  of  simple  singular  points  and  only 
with  nonzero  angles.  Having  defined  the  index  for  such  curves,  we  can  define 
it  for  an  arbitrary  curve  connecting  two  nonsingular  points:  to  do  this  it  is 
sufficient  to  approximate  the  curve  by  one  which  intersects  only  the  manifold 
of  simple  singular  points  and  only  with  nonzero  angles.  It  can  be  shown  that 
the  index  does  not  depend  on  the  choice  of  the  approximating  curve. 

Problem. Find the index of the circle $p = \cos t$, $q = \sin t$ oriented by the parameter t,
$0 \le t \le 2\pi$, in the lagrangian manifold $p^2 + q^2 = 1$ of the phase plane.

Answer. $+2$.

Finally, the Morse index of a phase curve in $\mathbb{R}^{2n}$ can now be defined as the
Maslov index of a curve in an $(n + 1)$-dimensional lagrangian manifold in a
suitable $(2n + 2)$-dimensional phase space. As coordinates in this space we
will take $(p_0, p; q_0, q)$ (where $(p, q) \in \mathbb{R}^{2n}$). If we set $q_0 = t$ and $p_0 = -H(p, q)$,
and let the point $(p, q)$ range over the n-dimensional lagrangian manifold in
$\mathbb{R}^{2n}$ obtained from the original after time t by the action of the phase flow,
then under change of t the points in $\mathbb{R}^{2n+2}$ form an $(n + 1)$-dimensional
lagrangian manifold. The graph of the motion of a phase point under the
action of the phase flow can be considered as a curve on this $(n + 1)$-dimensional
lagrangian manifold. We can verify that the Maslov index of this graph
agrees with the Morse index of the original phase curve.
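
The signed-crossing definition of the Maslov index given in this section can be checked numerically on the circle from the Problem above. The following is only a sketch under stated assumptions: the name `maslov_index` and the sampling scheme are hypothetical, the closed curve is described by its velocity components dp/dt and dq/dt on the phase plane, and the curve is assumed to cross the singular set only at simple points where dp/dt does not vanish.

```python
import math

def maslov_index(dp, dq, samples=3600):
    """Signed number of crossings of the singular set dq/dp = 0 over one circuit.

    dp, dq: functions of t in [0, 2*pi] giving dp/dt and dq/dt along the curve.
    A crossing from dq/dp < 0 to dq/dp > 0 counts +1, the reverse counts -1.
    """
    # Half-step offset so that no sample lands exactly on a zero of dp or dq.
    ts = [2.0 * math.pi * (k + 0.5) / samples for k in range(samples + 1)]
    index = 0
    for t0, t1 in zip(ts, ts[1:]):
        if dq(t0) * dq(t1) < 0:            # projection onto q is singular in between
            slope_before = dq(t0) / dp(t0)  # dq/dp just before the crossing
            slope_after = dq(t1) / dp(t1)   # dq/dp just after the crossing
            index += 1 if slope_before < 0 < slope_after else -1
    return index

# Circle p = cos t, q = sin t: dp/dt = -sin t, dq/dt = cos t.
circle_index = maslov_index(lambda t: -math.sin(t), lambda t: math.cos(t))
```

The two crossings occur at $t = \pi/2$ and $t = 3\pi/2$, both from the negative to the positive side; reversing the orientation of the circle changes the sign of the result.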

C  Indices  of  closed  curves 

The  indices  of  closed  curves  on  lagrangian  submanifolds  of  a  linear  phase 
space  can  also  be  calculated  with  the  help  of  a  complex  structure.  In  addition 
to the symplectic structure $dp \wedge dq$ on the linear phase space $\mathbb{R}^{2n} = \{(p, q)\}$,
we introduce a euclidean structure (with scalar square $p^2 + q^2$) and a
complex structure, in which multiplication by i is

$$I: \mathbb{R}^{2n} \to \mathbb{R}^{2n}, \quad I(p, q) = (-q, p), \quad z = p + iq, \quad \mathbb{C}^n = \{z\}.$$

All three structures are connected by the relation

$$[x, y] = (Ix, y),$$

where  the  square  brackets  denote  the  skew-scalar  product. 

Linear  transformations  of  the  phase  space  preserving  any  two  (and, 
therefore, all three) structures are called unitary transformations. Such
transformations take lagrangian planes to lagrangian planes.

Every lagrangian plane can be obtained from any other (e.g., from the
real plane $\mathbb{R}^n$ given by the equation $q = 0$) by a unitary transformation. In
addition, any two unitary transformations A and B carrying the real plane
to the same lagrangian plane differ by a unitary transformation which is a
real orthogonal transformation:

$$B = AC, \quad \text{where } C\mathbb{R}^n = \mathbb{R}^n.$$

Conversely,  any  preliminary  orthogonal  transformation  does  not  change  the 
image  of  the  plane  under  the  action  of  a  unitary  transformation. 

We now note that the determinant of an orthogonal transformation is
equal to $\pm 1$. Therefore the square of the determinant of a unitary
transformation carrying the real plane to a given lagrangian plane depends only on the
lagrangian plane itself and does not depend at all on the choice of unitary
transformation.

After  these  preliminary  remarks  we  return  to  our  lagrangian  manifold 
and  closed  oriented  curve  lying  in  it.  At  every  point  of  the  curve,  there  is  a 
plane  tangent  to  the  lagrangian  manifold  in  the  symplectic  vector  space.  The 
square of the determinant of the unitary transformation carrying the real
plane to this tangent plane is a complex number with modulus one. As a
point  moves  along  our  closed  curve,  this  complex  number  changes.  After  an 
entire  circuit  of  the  curve,  the  square  of  the  determinant  makes  some  integral 
number  of  rotations  around  the  origin  on  the  plane  of  complex  variables, 
oriented  from  1  to  i.  This  integer  is  the  index  of  the  closed  curve. 
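
For a closed curve on the phase plane (n = 1) this rotation number is easy to compute. The sketch below is an illustration, not a general implementation: the name `index_closed_curve` is hypothetical, and it uses the fact that for n = 1 the unitary transformation carrying the real line $q = 0$ to the tangent line spanned by $z' = dp/dt + i\,dq/dt$ is multiplication by $z'/|z'|$, so that the squared determinant is $(z'/|z'|)^2$.

```python
import cmath
import math

def index_closed_curve(dp, dq, samples=3600):
    """Rotation number of (z'/|z'|)**2 around the origin, z' = dp/dt + i*dq/dt."""
    total_angle = 0.0
    previous = None
    for k in range(samples + 1):
        t = 2.0 * math.pi * k / samples
        tangent = complex(dp(t), dq(t))
        det_squared = (tangent / abs(tangent)) ** 2   # unit complex number
        if previous is not None:
            # Small phase increment between consecutive samples.
            total_angle += cmath.phase(det_squared / previous)
        previous = det_squared
    return round(total_angle / (2.0 * math.pi))

# Circle p = cos t, q = sin t: z' = i*exp(i*t), so det**2 = -exp(2*i*t).
circle_rotation = index_closed_curve(lambda t: -math.sin(t), lambda t: math.cos(t))
```

Counterclockwise rotation (from 1 to i) is counted as positive, so the circle gives the same value as the signed count of crossings of the singular set.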

The indices of closed curves enter into asymptotic formulas for stationary
problems (characteristic oscillations). Assume that the phase flow
corresponding to the potential U has an invariant lagrangian manifold lying on
the energy level $H = E$. Then the equation

$$\tfrac{1}{2}\Delta\psi = \lambda^2 (U(q) - E)\psi$$

has a series of eigenvalues $\lambda_N \to \infty$ with asymptotic form $\lambda_N = \mu_N + O(\mu_N^{-1})$
if, for every closed contour $\gamma$ on the lagrangian manifold, we have the
congruence

$$\frac{2\mu_N}{\pi} \oint_\gamma p\,dq \equiv \operatorname{ind}\gamma \pmod 4.$$

In the one-dimensional case, the lagrangian manifold is a circle, its index
is equal to 2, and the formula above reduces to the so-called “quantization
condition”

$$\mu_N \oint_\gamma p\,dq = 2\pi\left(N + \tfrac{1}{2}\right).$$
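
The reduction to the quantization condition can be written out step by step. The block below is a sketch: the first line only uses the value ind γ = 2 stated above, while the oscillator potential $U(q) = q^2/2$ in the second part is a hypothetical example chosen here for illustration.

```latex
\begin{align*}
\frac{2\mu_N}{\pi}\oint_\gamma p\,dq \equiv 2 \pmod 4
 &\iff \frac{2\mu_N}{\pi}\oint_\gamma p\,dq = 4N + 2
  \iff \mu_N \oint_\gamma p\,dq = 2\pi\left(N + \tfrac12\right). \\[4pt]
% Hypothetical example: H = (p^2 + q^2)/2 = E is a circle of radius \sqrt{2E}.
\oint_\gamma p\,dq &= \pi\left(\sqrt{2E}\right)^2 = 2\pi E
 \quad\Longrightarrow\quad \mu_N = \frac{N + \tfrac12}{E}.
\end{align*}
```

If one reads $\lambda$ as $1/h$ (an assumption about normalization, obtained by rewriting the equation as $-\tfrac{1}{2\lambda^2}\Delta\psi + U\psi = E\psi$), the last relation becomes $E \approx h(N + \tfrac12)$, which agrees with the known levels of the harmonic oscillator.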

The  eigenfunctions  corresponding  to  these  eigenvalues  are  also  associated  with  lagrangian 
manifolds,  but  this  association  is  not  so  simple.  In  fact,  we  cannot  write  down  asymptotic 
formulas  for  eigenfunctions,  but  only  for  functions  approximately  satisfying  the  equations  of 
characteristic functions. These functions turn out to be small outside the projection of the
lagrangian manifold onto the configuration space. The asymptotic formulas have singularities near
the  caustics  formed  by  the  projection. 

The actual eigenfunctions, however, can behave entirely differently, at least if the
eigenvalue is multiple or if there are eigenvalues close to it (cf. Appendix 10).

Lagrangian singularities are singularities of projections of lagrangian
manifolds onto configuration space. Such singularities are encountered in
investigating global solutions to the Hamilton-Jacobi equation, in studying
caustics, focal or conjugate points, in analyzing the propagation of
discontinuities and shock waves in the mechanics of a solid medium, and also in
problems of short wave asymptotics (cf. Appendix 11).

In  order  to  describe  lagrangian  singularities  we  must  first  say  a  few  words 
about  singularities  of  smooth  mappings  in  general.  We  begin  with  the 
simplest  examples. 

A  Singularities  of  smooth  mappings  of  a  surface  onto  a  plane 

The  mapping  projecting  a  sphere  onto  a  plane  is  singular  on  the  equatorial 
circle  (at  points  of  the  equator  the  rank  of  the  derivative  drops  to  one).  As  a 
result,  a  curve  is  formed  on  the  plane  of  projection  (the  so-called  apparent 
contour) bounding regions in which points have different numbers of
pre-images: every point of the plane inside the apparent contour has two
pre-images,  and  every  point  outside  has  none. 

In  more  complicated  cases  of  “apparent  contours”  there  can  be  more 
complicated  singularities.  Consider,  for  example,  the  surface  given  in  three- 
dimensional  space  with  coordinates  (x,  y,  z)  by  the  equation  (Figure  245) 

$$x = yz - z^3$$

and  the  mapping  of  projection  parallel  to  the  z-axis  onto  the  plane  with 
coordinates  (x,  y). 

The singular points of the projection form a smooth curve on the surface
(with equation $3z^2 = y$). However, the image of this curve on the (x, y) plane
is not a smooth curve. This image is a semi-cubical parabola with a cusp at
the point (0, 0) with equation

$$27x^2 = 4y^3.$$

Such  a  curve  divides  the  plane  into  two  parts:  a  smaller  part  (inside  the 
cusp)  and  a  larger  part  (outside).  Over  each  point  of  the  smaller  part  there 
are  three  points  of  our  surface,  and  over  each  point  of  the  larger  part  there  is 
only  one. 
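
The count of points of the surface over a given point of the plane can be checked directly, since the pre-images of (x, y) are the real roots z of the cubic $z^3 - yz + x = 0$. The function below is a minimal sketch; the name `count_preimages` is hypothetical, and the count is obtained from the signs of the cubic at its two critical points rather than from a root finder.

```python
import math

def count_preimages(x, y):
    """Number of distinct real z with y*z - z**3 == x (points of the surface over (x, y))."""
    def cubic(z):
        return z**3 - y * z + x        # its real roots are the pre-images

    if y <= 0:
        return 1                       # cubic is monotone increasing: one root
    z_crit = math.sqrt(y / 3.0)        # critical points are z = -z_crit and z = +z_crit
    local_max, local_min = cubic(-z_crit), cubic(z_crit)
    if local_max > 0 > local_min:
        return 3                       # inside the cusp: 27*x**2 < 4*y**3
    if local_max == 0 or local_min == 0:
        return 2                       # on the apparent contour: a double root
    return 1                           # outside the cusp

inside = count_preimages(0.0, 1.0)     # roots z = -1, 0, 1
outside = count_preimages(1.0, 1.0)
```

The condition `local_max > 0 > local_min` is equivalent to $27x^2 < 4y^3$, so the region with three pre-images is exactly the smaller part of the plane bounded by the semi-cubical parabola.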

We  now  consider  any  small  deformation  of  our  surface.  It  turns  out  that, 
under  projection  of  any  surface  close  to  ours,  the  apparent  contour  will 
always  have  a  similar  singularity  (semi-cubical  cusp)  at  some  point  close  to 
the  singularity  of  the  apparent  contour  of  the  original  surface.  In  other  words, 
this  singularity  is  not  removable  by  a  small  perturbation  of  the  surface. 

Furthermore,  in  place  of  a  deformation  of  the  surface,  we  can  arbitrarily 
deform  the  mapping  itself  of  the  surface  to  the  plane  (no  longer  caring 
whether  it  is  a  projection),  as  long  as  it  remains  smooth  and  the  deformation 
is small. It turns out that, for these deformations too, the cusp does not
disappear but is only slightly deformed.

The examples presented here exhaust all typical singularities of mappings
of a surface to the plane. It can be shown that all more complicated
singularities are removable by a small perturbation. Therefore, by slightly
deforming any smooth mapping, we can always arrange that in a neighborhood
of any point of the surface, the mapping will be either nonsingular, or
structurally similar to the projection mapping of a sphere onto a plane near
the equator, or structurally similar to the projection mapping of the surface
considered above with a cubic cusp on the apparent contour.

The  words  “structurally  similar  to”  mean  that,  on  the  pre-image  surface 
and  the  image  plane,  we  can  choose  local  coordinates  (in  a  neighborhood  of 
our  point  and  its  image)  such  that  in  these  coordinates  the  mapping  will  be 
written  in  a  special  way.  Namely,  the  normal  forms  to  which  the  mapping 
of  the  surface  to  the  plane  will  be  reduced  in  a  neighborhood  of  points  of  the 
three  types  indicated  above  will  be 

$y_1 = x_1$, $y_2 = x_2$ (nonsingular point)

$y_1 = x_1^2$, $y_2 = x_2$ (a fold, as on the equator of the sphere)

$y_1 = x_1 x_2 - x_1^3$, $y_2 = x_2$ (a “tuck” with a cusp on the apparent contour)

Here $(x_1, x_2)$ are the local coordinates in the pre-image, and $(y_1, y_2)$ are the
local coordinates in the image.

The proof of this theorem (it is due to H. Whitney) and its multidimensional
generalizations can be found in works on the theory of singularities of
smooth maps, such as

V.  I.  Arnold,  Singularities  of  smooth  mappings,  Russian  Math.  Surveys  23: 1  (1968)  1-44. 

Symposium  on  Singularities  of  Smooth  Manifolds  and  Maps,  Univ.  of  Liverpool,  1969-70. 
Proceedings.  Springer,  1971.  See  especially  the  article  of  R.  Thom  and  H.  Levine. 

Golubitsky and Guillemin, Stable Mappings and Their Singularities, Springer-Verlag, 1973.

B Singularities of projection of lagrangian manifolds

We now consider an n-dimensional configuration manifold, the corresponding
2n-dimensional phase space, and an n-dimensional lagrangian submanifold
(i.e., an n-dimensional submanifold on which the 2-form giving the
symplectic structure of the phase space is identically zero).

By  projecting  the  lagrangian  manifold  onto  the  configuration  space,  we 
obtain  a  mapping  of  one  smooth  n-dimensional  manifold  to  another.  At  most 
points, this mapping is a local diffeomorphism, but at some points of the
lagrangian manifold the rank of the differential drops. These points are said
to be singular. Under projection of the set of singular points to the
configuration space an “apparent contour” is formed, which is called a caustic in the
lagrangian  case. 

Caustics  can  have  complicated  singularities;  however,  as  in  the  usual 
theory  of  singularities  of  smooth  maps,  we  can  get  rid  of  singularities  which 
are  too  complicated  by  a  small  perturbation  (here,  by  a  small  perturbation, 
we  mean  a  small  deformation  of  a  lagrangian  manifold  in  phase  space  under 
which  this  manifold  remains  lagrangian). 

After  this  there  remain  only  the  simplest  unremovable  singularities,  for 
which  we  can  write  out  normal  forms  and  which  we  can  study  once  and  for  all. 
When  considering  problems  in  general  position  which  do  not  satisfy  any 
special  properties  of  symmetry,  it  is  natural  to  expect  that  only  these  simple 
unremovable  singularities  will  appear. 

Consider,  for  example,  the  caustics  formed  on  a  wall  by  light  from  a  point 
source  reflected  from  some  smooth  curved  surface  (here  the  four-dimensional 
phase  space  is  formed  by  straight  lines  intersecting  the  surface  of  the  wall  in 
all  possible  directions,  and  the  lagrangian  submanifold  by  the  rays  of  light 
coming  from  the  source  as  they  intersect  the  wall).  By  moving  the  source,  we 
can  see  that  generally  the  caustics  have  only  simple  singularities  (semi- 
cubical  cusps),  while  more  complicated  singularities  appear  only  for  special, 
exceptional  positions  of  the  source. 

We will give below, for $n \le 5$, normal forms for singularities of the
projection of an n-dimensional lagrangian submanifold of 2n-dimensional phase
space onto an n-dimensional configuration space. There are a finite number
of these normal forms, and their classification is related (in a rather
mysterious way) to the classifications of simple Lie groups, simple degenerate
critical points of functions, regular polyhedra, and many other objects. For
$n \ge 6$, the normal forms of some singularities must inevitably contain
parameters. For further details the reader is referred to the articles:


V. I. Arnold, Normal forms for functions near degenerate critical points, the Weyl groups of
$A_k$, $D_k$, $E_k$, and lagrangian singularities, Functional Analysis and Its Applications 6:4 (1972)
254-272.

V. I. Arnold, Critical points of smooth functions and their normal forms, Uspekhi Mat. Nauk
30:5 (1975).

C Tables of normal forms of typical singularities of projections of lagrangian manifolds of dimension $n \le 5$


We  will  use  the  following  notation: 

$(q_1, \dots, q_n)$ are coordinates on the configuration space,

$(p_1, \dots, p_n)$ are the corresponding momenta,

so that p and q together form a symplectic coordinate system in the phase
space.

We will give a lagrangian manifold with the help of a generating function
F by the formulas

$$q_i = \frac{\partial F}{\partial p_i}, \qquad p_j = -\frac{\partial F}{\partial q_j},$$

where the index i runs over some subset of {1, ..., n} and j runs over the
remainder of {1, ..., n}. That is, $i = 1$, $j > 1$ for singularities denoted in the list
by $A_k$, and $i = 1, 2$, $j > 2$ for singularities denoted by $D_k$ and $E_k$.

With this notation, one and the same expression $F(p_i, q_j)$ can be
considered as giving a lagrangian manifold in spaces of a different number of
dimensions: we can add arbitrarily many arguments $q_j$ on which F does not
actually depend.
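
That a generating function of this kind really produces a lagrangian manifold can be checked numerically in a small case. The sketch below makes several assumptions that should be read as such: it takes the two-dimensional example $F = p_1^4 + q_2 p_1^2$ with i = 1, j = 2, uses the sign convention $q_1 = \partial F/\partial p_1$, $p_2 = -\partial F/\partial q_2$, and evaluates the form $dp \wedge dq$ on finite-difference tangent vectors. All function names are hypothetical.

```python
def manifold_point(p1, q2):
    """Point (p1, p2, q1, q2) of the manifold generated by F = p1**4 + q2*p1**2."""
    q1 = 4.0 * p1**3 + 2.0 * q2 * p1   # q1 = dF/dp1
    p2 = -p1**2                        # p2 = -dF/dq2
    return (p1, p2, q1, q2)

def omega(xi, eta):
    """Symplectic form dp1^dq1 + dp2^dq2 on vectors ordered as (p1, p2, q1, q2)."""
    return (xi[0] * eta[2] - eta[0] * xi[2]) + (xi[1] * eta[3] - eta[1] * xi[3])

def tangent_vectors(p1, q2, h=1e-5):
    """Central-difference tangent vectors along the parameters p1 and q2."""
    def difference(a, b):
        return tuple((u - v) / (2.0 * h) for u, v in zip(a, b))
    along_p1 = difference(manifold_point(p1 + h, q2), manifold_point(p1 - h, q2))
    along_q2 = difference(manifold_point(p1, q2 + h), manifold_point(p1, q2 - h))
    return along_p1, along_q2

def lagrangian_defect(p1, q2):
    """|omega| on the tangent plane; zero (up to rounding) for a lagrangian manifold."""
    return abs(omega(*tangent_vectors(p1, q2)))

defect = lagrangian_defect(0.7, -0.3)
```

With the opposite sign in the formula for $p_2$ the two contributions to `omega` would add instead of cancelling, which is one way to see that the minus sign is needed.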

The list of normal forms of typical singularities is now as follows: for
$n = 1$

$$A_1: F = p_1^2, \qquad A_2: F = \pm p_1^3;$$

for $n = 2$, in addition to the two above, there is

$$A_3: F = \pm p_1^4 + q_2 p_1^2;$$

for $n = 3$, in addition to the three preceding, there are

$$A_4: F = \pm p_1^5 + q_3 p_1^3 + q_2 p_1^2,$$

$$D_4: F = \pm p_1^2 p_2 \pm p_2^3 + q_3 p_2^2;$$

for $n = 4$, in addition to the five preceding, there are

$$A_5: F = \pm p_1^6 + q_4 p_1^4 + q_3 p_1^3 + q_2 p_1^2,$$

$$D_5: F = \pm p_1^2 p_2 \pm p_2^4 + q_4 p_2^3 + q_3 p_2^2;$$

for $n = 5$, in addition to the seven preceding, there are

$$A_6: F = \pm p_1^7 + q_5 p_1^5 + \dots + q_2 p_1^2,$$

$$D_6: F = \pm p_1^2 p_2 \pm p_2^5 + q_5 p_2^4 + q_4 p_2^3 + q_3 p_2^2,$$

$$E_6: F = \pm p_1^3 \pm p_2^4 + q_5 p_1 p_2^2 + q_4 p_1 p_2 + q_3 p_2^2.$$

D Discussion of the normal forms

A point of type $A_1$ is nonsingular. A singularity of type $A_2$ is a fold singularity.
If we take $(p_1, q_2, \dots, q_n)$ as coordinates on the lagrangian manifold, then
the projection mapping may be written as

$$(p_1, q_2, \dots, q_n) \to (\pm 3p_1^2, q_2, \dots, q_n).$$

A singularity of type $A_3$ is a tuck with a semi-cubical cusp on the apparent
contour. To convince ourselves of this, it is enough to write out the
corresponding mapping of the two-dimensional lagrangian manifold to the
plane:

$$(p_1, q_2) \to (\pm 4p_1^3 + 2q_2 p_1, q_2).$$
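
The semi-cubical cusp can be exhibited explicitly. The following worked computation is a sketch that assumes the $A_3$ mapping in the form $(p_1, q_2) \mapsto (q_1, q_2)$ with $q_1 = \pm 4p_1^3 + 2q_2 p_1$, i.e. $q_1 = \partial F/\partial p_1$ for $F = \pm p_1^4 + q_2 p_1^2$.

```latex
\begin{align*}
\frac{\partial q_1}{\partial p_1} = \pm 12 p_1^2 + 2 q_2 = 0
  &\quad\Longrightarrow\quad q_2 = \mp 6 p_1^2
  && \text{(singular points)} \\
q_1 = \pm 4 p_1^3 + 2(\mp 6 p_1^2)\,p_1
  &\quad\Longrightarrow\quad q_1 = \mp 8 p_1^3
  && \text{(their projection)} \\
27 q_1^2 = 1728\, p_1^6, \quad 8 q_2^3 = \mp 1728\, p_1^6
  &\quad\Longrightarrow\quad 27 q_1^2 = \mp 8 q_2^3
  && \text{(the caustic)}
\end{align*}
```

The set of singular points is a smooth parabola in the $(p_1, q_2)$ coordinates, while its image $27 q_1^2 = \mp 8 q_2^3$ is a semi-cubical parabola with a cusp at the origin, as in the example of the surface $x = yz - z^3$.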

A singularity of type $A_4$ first appears in the three-dimensional case, and
the corresponding caustic is represented by a surface in three-dimensional
space (Figure 246) with a singularity called a swallowtail (we already
encountered this in Section 46).

The caustic of a singularity of type $D_4$ in three-dimensional space is
represented as a surface with three cuspidal edges (of type $A_3$), tangent at
one point; two of these cuspidal edges can be imaginary, so that there are
two versions of the caustic of $D_4$.


Figure  246  Typical  singularities  of  caustics  in  three-dimensional  space 


E  Lagrangian  equivalence 

We  must  now  say  in  what  sense  the  examples  mentioned  are  normal  forms  of 
typical  singularities  of  projections  of  lagrangian  manifolds.  First  of  all,  we 
will  define  which  singularities  we  will  consider  to  have  the  “same  structure.” 

A projection mapping of a lagrangian manifold onto configuration space
will be called a lagrangian mapping for short. Suppose that we are given two
lagrangian mappings of manifolds of the same dimension n (the corresponding
n-dimensional lagrangian manifolds lie, in general, in different phase
spaces which are cotangent bundles of two different configuration spaces). We
say that two such lagrangian mappings are lagrangian equivalent if there is a
symplectic diffeomorphism of the first phase space to the second, taking
fibers of the first cotangent bundle to fibers of the second, and taking the first
lagrangian manifold to the second. The symplectic diffeomorphism itself is
then called a lagrangian equivalence mapping.

We  note  that  two  lagrangian  equivalent  lagrangian  mappings  are  taken 
one  to  the  other  with  the  help  of  diffeomorphisms  in  the  pre-image  space  and 
the  image  space  (or,  as  they  say  in  analysis,  are  carried  to  one  another  by  a 
change of coordinates in the pre-image and in the image). In fact, our
symplectic diffeomorphism restricted to the lagrangian manifold gives a
diffeomorphism of the pre-images; a diffeomorphism of the configuration-space
images  arises  because  fibers  are  carried  to  fibers. 

In  particular,  the  caustics  of  the  two  lagrangian  equivalent  mappings  are 
diffeomorphic,  hence  a  classification  up  to  lagrangian  equivalence  implies  a 
classification of caustics. However, the classification up to lagrangian
equivalence is finer than the classification of caustics, since a diffeomorphism of
caustics does not in general give rise to a lagrangian equivalence of the
mappings. Furthermore, the classification up to lagrangian equivalence is finer
than the classification up to diffeomorphisms of the pre-image and image,
since  not  every  such  pair  of  diffeomorphisms  is  realized  by  a  symplectic 
diffeomorphism  of  the  phase  space. 

A  lagrangian  mapping  considered  in  a  neighborhood  of  some  chosen  point 
is  called  lagrangian  equivalent  at  that  point  to  another  lagrangian  mapping 
(also  with  a  chosen  point),  if  there  is  a  lagrangian  equivalence  of  the  first 
mapping  in  some  neighborhood  of  the  first  point  onto  the  second  in  some 
neighborhood  of  the  second  point,  carrying  the  first  point  to  the  second. 

We can now formulate a classification theorem for singularities of
lagrangian mappings in dimensions $n \le 5$.


Theorem. Every n-dimensional lagrangian manifold ($n \le 5$) can, by an
arbitrarily small perturbation in the class of lagrangian manifolds, be made into
one such that the projection mapping onto the configuration space will be
lagrangian equivalent at every point to one of the lagrangian mappings in
the list above.


In  particular,  a  two-dimensional  lagrangian  manifold  can  be  put  in 
“general  position”  by  an  arbitrarily  small  perturbation  in  the  class  of 
lagrangian  manifolds,  so  that  the  projection  mapping  onto  the  configuration 
space  (two-dimensional)  will  not  have  singularities  other  than  folds  (which 
can be reduced by a lagrangian equivalence to the normal form $A_2$) or tucks
(which can be reduced by a lagrangian equivalence to the normal form $A_3$).

We note that this assertion about two-dimensional lagrangian mappings does not follow
from the classification theorem for general (non-lagrangian) mappings. In the first place,
lagrangian mappings make up a very restricted class among all smooth mappings, and therefore
they can (and actually do for n > 2) have, as typical singularities, ones which are not typical for
mappings of general form. Secondly, the possibility of reducing a mapping to normal form by
diffeomorphisms of the pre-image and image does not imply that this can be done using a
lagrangian equivalence.


In  this  way,  the  caustics  of  a  two-dimensional  lagrangian  manifold  in 
general  position  have  as  singularities  only  semi-cubical  cusps  (and  points  of 
transversal intersection). All more complicated singularities break up under
a small perturbation of the lagrangian manifold; the resulting cusps and
self-intersection points of caustics are unremovable by small perturbations and
are only slightly deformed.

Normal  forms  of  the  singularities  A4,  D4, . . .  can  be  used  in  a  similar  way 
for  studying  the  caustics  of  lagrangian  manifolds  of  higher  dimensions,  and 
also  for  studying  the  development  of  caustics  of  low-dimensional  lagrangian 
manifolds,  when  parameters  on  which  the  manifold  depends  are  varied.116 


Other  applications  of  the  formulas  of  this  section  can  be  found  in  the  theory  of  Legendre 
singularities, i.e., singularities of wave fronts, Legendre transforms, envelopes, and convex hulls
(cf.  Appendix  4).  The  theories  of  lagrangian  and  Legendre  singularities  have  direct  application, 
not  only  in  geometric  optics  and  the  theory  of  asymptotics  of  oscillating  integrals,  but  also  in 
the  calculus  of  variations,  in  the  theory  of  discontinuous  solutions  of  nonlinear  partial  differential 
equations,  in  optimization  problems,  pursuit  problems,  etc.  R.  Thom  has  suggested  the  general 
name  catastrophe  theory  for  the  theory  of  singularities,  the  theory  of  bifurcations,  and  their 
applications. 


116 See, e.g., V. Arnold, Evolution of wavefronts and equivariant Morse lemma. Comm. Pure
Appl. Math., 1976, No. 6.

Not all first integrals of equations in classical mechanics are explained by
obvious symmetries of a problem (examples are specific integrals of Kepler’s
problem, the problem of geodesics on an ellipsoid, etc.). In such cases, we
speak of “hidden symmetry.”117

Interesting  examples  of  such  hidden  symmetry  are  furnished  by  the 
Korteweg-de  Vries  equation 

$$u_t = 6uu_x - u_{xxx}. \tag{1}$$

This  nonlinear  partial  differential  equation  first  arose  in  the  theory  of 
waves  in  shallow  water;  later  it  turned  out  that  this  equation  is  encountered 
in  a  whole  series  of  problems  in  mathematical  physics. 

As  a  result  of  a  series  of  numerical  experiments,  remarkable  properties 
of  solutions  of  this  equation  with  zero  boundary  conditions  at  infinity  were 
discovered: as $t \to +\infty$ and $t \to -\infty$ these solutions decompose into
“solitons”, waves of definite form moving with different velocities.

To obtain a soliton moving with velocity c, it is sufficient to substitute the function
$u = \varphi(x - ct)$ into equation (1). Then we obtain the equation $\varphi'' = 3\varphi^2 + c\varphi + d$ for $\varphi$
(d is a parameter). This is Newton’s equation with a cubic potential. There is a saddle on the
phase plane $(\varphi, \varphi')$. The separatrix going from this saddle back to the same saddle, when the
saddle is at $\varphi = 0$, determines a solution $\varphi$ tending to 0 as $x \to \pm\infty$; it is a soliton.
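
A numerical check of a travelling-wave solution may make this concrete. The sketch below rests on assumptions not stated in the text: it takes equation (1) in the form $u_t = 6uu_x - u_{xxx}$, sets d = 0, and uses the standard closed-form profile $\varphi(s) = -(c/2)\,\mathrm{sech}^2(\sqrt{c}\,s/2)$ for c > 0. The function names and the finite-difference step are hypothetical choices.

```python
import math

def soliton(x, t, c):
    """Candidate soliton u = phi(x - c*t), phi(s) = -(c/2) / cosh(sqrt(c)*s/2)**2."""
    s = x - c * t
    return -0.5 * c / math.cosh(0.5 * math.sqrt(c) * s) ** 2

def kdv_residual(x, t, c, h=1e-2):
    """Finite-difference estimate of u_t - (6*u*u_x - u_xxx) at the point (x, t)."""
    def u(xx, tt):
        return soliton(xx, tt, c)

    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h**3)
    return u_t - (6.0 * u(x, t) * u_x - u_xxx)

residual = kdv_residual(0.3, 0.1, 1.0)
```

The residual is not exactly zero because the derivatives are approximated with second-order central differences; it shrinks as the step `h` is reduced. The profile is negative and decays exponentially as $x \to \pm\infty$, in agreement with the description of the separatrix.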

When  solitons  collide,  there  is  a  complicated  nonlinear  interaction. 
However,  numerical  experiments  showed  that  the  sizes  and  velocities  of  the 
solitons  do  not  change  as  a  result  of  collision.  And,  in  fact,  Kruskal,  Zabusky, 
Lax,  Gardner,  Green,  and  Miura  succeeded  in  finding  a  whole  series  of  first 
integrals  for  the  Korteweg-de  Vries  equation.  These  integrals  have  the  form 
Is  —  J  Ts(w, . . . ,  u(s))dx,  where  Ps  is  a  polynomial.  For  example,  it  is  easy  to 
verify  that  the  following  are  first  integrals  of  equation  (1): 
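
Such a verification can be sketched symbolically. The densities used below ($u$, $u^2$, $u_x^2/2 + u^3$ and $u_{xx}^2/2 - \tfrac{5}{2}u^2u_{xx} + \tfrac{5}{2}u^4$, labelled $I_{-1}, I_0, I_1, I_2$) are an assumption: they are low-order conserved densities for equation (1) taken in the form $u_t = 6uu_x - u_{xxx}$, and the labelling is chosen to agree with the use of $I_1$ as hamiltonian later in this appendix. A density is conserved when its time derivative is a total $x$-derivative, which is tested here by applying the variational derivative.

```python
import sympy as sp

N = 12                       # jet variables u0..u11; ample for the orders used here
U = sp.symbols(f"u0:{N}")    # U[k] stands for the k-th x-derivative of u

def Dx(F, times=1):
    """Total x-derivative, applied `times` times, on polynomials in the jet variables."""
    for _ in range(times):
        F = sp.expand(sum(U[k + 1] * sp.diff(F, U[k]) for k in range(N - 1)))
    return F

def euler(F):
    """Variational derivative of F; it vanishes exactly when F is a total x-derivative."""
    return sp.expand(sum((-1) ** k * Dx(sp.diff(F, U[k]), k) for k in range(N - 1)))

kdv_rhs = 6 * U[0] * U[1] - U[3]      # assumed right-hand side of equation (1)

def Dt(P):
    """Time derivative of a density P along the flow u_t = kdv_rhs."""
    return sp.expand(sum(sp.diff(P, U[k]) * Dx(kdv_rhs, k) for k in range(N - 1)))

half5 = sp.Rational(5, 2)
densities = {
    "I_-1": U[0],
    "I_0": U[0] ** 2,
    "I_1": U[1] ** 2 / 2 + U[0] ** 3,
    "I_2": U[2] ** 2 / 2 - half5 * U[0] ** 2 * U[2] + half5 * U[0] ** 4,
}
is_conserved = {name: euler(Dt(P)) == 0 for name, P in densities.items()}
print(is_conserved)
```

The same test rejects a density such as $u^3$ alone, which is why the combination with $u_x^2/2$ is needed.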


The appearance of an infinite series of first integrals is easily explained by
the following theorem of Lax.118 We will denote the operator of multiplication
by a function of $x$ by the symbol for the function itself, and the operator
of differentiation with respect to $x$ by the symbol $\partial$. Consider the
Sturm-Liouville operator $L = -\partial^2 + u$ depending on a function $u(x)$. We verify
directly:

Theorem. The Korteweg-de Vries equation (1) is equivalent to the equation
$\dot u = [L, A]$, where $A = 4\partial^3 - 3(u\partial + \partial u)$.
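
The direct verification can be delegated to a computer algebra system. The sketch below assumes $L = -\partial^2 + u$ and $A = 4\partial^3 - 3(u\partial + \partial u)$, applies both operators to an arbitrary test function $f$ (a name introduced only for this check), and compares the commutator with multiplication by $6uu_x - u_{xxx}$.

```python
import sympy as sp

x = sp.symbols("x")
u = sp.Function("u")(x)
f = sp.Function("f")(x)      # arbitrary test function the operators act on

def L(g):
    """Sturm-Liouville operator L = -d^2 + u."""
    return -sp.diff(g, x, 2) + u * g

def A(g):
    """A = 4 d^3 - 3 (u d + d u), where d = d/dx and 'd u' means d applied to u*g."""
    return 4 * sp.diff(g, x, 3) - 3 * (u * sp.diff(g, x) + sp.diff(u * g, x))

commutator = sp.expand(L(A(f)) - A(L(f)))
kdv_rhs = 6 * u * sp.diff(u, x) - sp.diff(u, x, 3)

# If the theorem holds, [L, A] is multiplication by kdv_rhs and the defect vanishes.
lax_defect = sp.expand(commutator - kdv_rhs * f)
print(lax_defect)
```

All terms containing derivatives of $f$ cancel, so the commutator is an operator of multiplication by a function, as the theorem requires.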

117 The term “accidental symmetry” is frequently used in English. [Trans. note.]

118  Lax,  P.  D.,  Integrals  of  nonlinear  equations  of  evolution  and  solitary  waves.  Comm.  Pure 
Appl.  Math.  21  (1968)  467-490. 
Directly from this theorem of Lax, we have

Corollary. The operators $L$ constructed from a solution of equation (1) are
unitarily equivalent for all $t$; in particular, each of the eigenvalues $\lambda$ of the
Sturm-Liouville problem $Lf = \lambda f$ with zero boundary conditions at infinity
is a first integral of the Korteweg-de Vries equation.

Gardner, V. E. Zakharov and L. D. Faddeev noted that equation (1) is a
completely integrable infinite-dimensional hamiltonian system, and found
the corresponding action-angle variables.119 A symplectic structure on the
space of functions vanishing at infinity is given by the skew-scalar product
$\omega^2(\partial w, \partial v) = \frac{1}{2}\int (w\,\partial v - v\,\partial w)\,dx$, and the hamiltonian of equation (1) is the
integral $I_1$. In other words, equation (1) can be written in the form of
Hamilton’s equation in the functional space of functions of $x$, $\dot u = (d/dx)(\delta I_1/\delta u)$.

Every integral $I_s$ gives in this way a “higher Korteweg-de Vries equation”
$\dot u = Q_s[u]$, where $Q_s = (d/dx)(\delta I_s/\delta u)$ is a polynomial in the derivatives
$u, u', \ldots, u^{(2s+1)}$. The integrals $I_s$ are in involution, and the flows corresponding
to them on the functional space commute.
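
As an illustration, the first two equations of this hierarchy can be generated and their flows compared. The sketch assumes the densities $P_1 = u_x^2/2 + u^3$ and $P_2 = u_{xx}^2/2 - \tfrac{5}{2}u^2u_{xx} + \tfrac{5}{2}u^4$ for $I_1$ and $I_2$ (an assumption, as are all function names); it computes $Q_s = (d/dx)(\delta I_s/\delta u)$ and the commutator of the two evolution equations.

```python
import sympy as sp

N = 12
U = sp.symbols(f"u0:{N}")    # U[k] stands for the k-th x-derivative of u

def Dx(F, times=1):
    """Total x-derivative, applied `times` times."""
    for _ in range(times):
        F = sp.expand(sum(U[k + 1] * sp.diff(F, U[k]) for k in range(N - 1)))
    return F

def variational_derivative(P):
    """delta I / delta u for I = integral of P dx."""
    return sp.expand(sum((-1) ** k * Dx(sp.diff(P, U[k]), k) for k in range(N - 1)))

def Q(P):
    """Right-hand side of u_t = (d/dx)(delta I / delta u)."""
    return Dx(variational_derivative(P))

def flow_commutator(F, G):
    """Commutator of the evolution equations u_t = F and u_t = G."""
    dG_along_F = sum(sp.diff(G, U[k]) * Dx(F, k) for k in range(N - 1))
    dF_along_G = sum(sp.diff(F, U[k]) * Dx(G, k) for k in range(N - 1))
    return sp.expand(dG_along_F - dF_along_G)

half5 = sp.Rational(5, 2)
P1 = U[1] ** 2 / 2 + U[0] ** 3
P2 = U[2] ** 2 / 2 - half5 * U[0] ** 2 * U[2] + half5 * U[0] ** 4

Q1, Q2 = Q(P1), Q(P2)
print(Q1)                          # should reproduce the right-hand side of (1)
print(Q2)                          # a fifth-order "higher" equation
print(flow_commutator(Q1, Q2))     # vanishes when the two flows commute
```

Working with jet variables $u_0, u_1, \ldots$ instead of an unknown function keeps every step polynomial, so the comparison with zero is exact.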

The explicit form of the polynomials $P_s$ and $Q_s$, and also the explicit form of the
action-angle variables (and therefore of solutions of equation (1)), is described in terms of solutions of
the direct and inverse problems of scattering theory with potential $u$.

The explicit form of the polynomials $Q_s$ can also be obtained from the following theorem of
Gardner, generalizing Lax's theorem. In the space of functions of $x$, we consider a differential
operator of the form $A = \sum_i p_i\,\partial^{2s+1-i}$, where $p_0 = 1$, and the remaining coefficients $p_i$ are
polynomials in $u$ and the derivatives of $u$ with respect to $x$. It turns out that, for any $s$ there is
an operator $A_s$ of order $2s + 1$ such that its commutator with the Sturm-Liouville operator $L$
is the operator of multiplication by a function, $[L, A_s] = Q_s$.

The operator $A_s$ is defined by these conditions uniquely up to the addition of linear
combinations of the $A_r$ with $r < s$; in the same way, the polynomials $Q_s$ are determined up to the addition
of linear combinations of the preceding $Q_r$’s.

V.  E.  Zakharov,  A.  B.  Shabat,  L.  D.  Faddeev,  and  others,  using  Lax’s 
method  and  techniques  of  inverse  scattering  theory,  have  studied  a  whole 
series of physically important equations, including the equations $u_{tt} - u_{xx} = \sin u$
and $i\psi_t + \psi_{xx} \pm \psi|\psi|^2 = 0$.

Investigation  of  the  problem  with  periodic  boundary  conditions  for  the 
Korteweg-de  Vries  equation  led  S.  P.  Novikov120  to  the  discovery  of  an 
interesting  class  of  completely  integrable  systems  with  a  finite  number  of 
degrees  of  freedom.  These  systems  are  constructed  in  the  following  way. 

Consider any finite linear combination of first integrals, $I = \sum_{i=0}^{n} c_i I_{n-i}$,
and let $c_0 = 1$. The set of stationary points of the flow with hamiltonian $I$

119 Zakharov, V. E. and Faddeev, L. D., The Korteweg-de Vries equation is a completely
integrable  hamiltonian  system.  Functional  Analysis  and  Its  Applications,  5:4(1971)  280-287. 

120  Novikov,  S.  P.,  The  periodic  problem  for  the  Korteweg-de  Vries  equation.  Functional 
Analysis  and  Its  Applications,  8:3  (1974)  236-246. 
on the functional space is invariant under the phase flows with hamiltonians
$I_s$, including the phase flow of equation (1).

On the other hand, these stationary points are determined from the
equations $(d/dx)(\delta I/\delta u) = 0$, or $\delta I/\delta u = d$. The second equation is the
Euler-Lagrange equation for the functional $I - dI_{-1}$ involving derivatives
of order $n$. Therefore, it has order $2n$ and can be written as a hamiltonian
system of equations in $2n$-dimensional euclidean space.
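
For $n = 1$ the construction can be carried out explicitly. The sketch below assumes the densities $u$, $u^2$ and $u_x^2/2 + u^3$ for $I_{-1}$, $I_0$ and $I_1$ (an assumption), forms $I = I_1 + c_1 I_0$, and computes the Euler-Lagrange expression of $I - dI_{-1}$.

```python
import sympy as sp

N = 8
U = sp.symbols(f"u0:{N}")          # U[k] stands for the k-th x-derivative of u
c1, d = sp.symbols("c1 d")

def Dx(F, times=1):
    """Total x-derivative, applied `times` times."""
    for _ in range(times):
        F = sp.expand(sum(U[k + 1] * sp.diff(F, U[k]) for k in range(N - 1)))
    return F

def variational_derivative(P):
    """Euler-Lagrange expression of the functional with density P."""
    return sp.expand(sum((-1) ** k * Dx(sp.diff(P, U[k]), k) for k in range(N - 1)))

P_minus1, P0, P1 = U[0], U[0] ** 2, U[1] ** 2 / 2 + U[0] ** 3

# Density of I - d*I_{-1} with I = I_1 + c1*I_0 (the case n = 1, c0 = 1).
density = P1 + c1 * P0 - d * P_minus1
stationary_eq = variational_derivative(density)
order = max(k for k in range(N) if stationary_eq.has(U[k]))
print(stationary_eq, order)
```

The resulting equation is of second order, in agreement with the count $2n$, and it is a Newton equation with a cubic potential of the same kind as the one that produced the soliton.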

It  turns  out  that  this  hamiltonian  system  with  n  degrees  of  freedom  has  n 
integrals  in  involution  and  can  be  integrated  completely  with  the  help  of 
suitable  action-angle  coordinates.  In  this  way,  we  obtain  a  finite-dimensional 
family  of  particular  solutions  of  the  Korteweg-de  Vries  equation  depending 
on $3n + 1$ parameters ($2n$ phase coordinates and $n + 1$ further parameters
$c_1, \ldots, c_n; d$).

These  solutions  have,  as  Novikov  showed,  remarkable  properties;  for 
example, in the periodic problem they give functions $u(x)$ for which the linear
differential equation with periodic coefficients

$-X'' + u(x)X = \lambda X$

has a finite number of zones of parametric resonance (cf. Section 25) on the
$\lambda$-axis.

After this book was written, much work was done on the subjects discussed
in this appendix, in particular by Novikov, Dubrovin, Krichever,
Manakov, Matveev, Its, Dikii, Manin, Drinfeld, Gelfand, Lax, Moser,
McKean, Van Moerbeke, Adler, Perelomov, Olshanetskii, and many others.
Among other things, Manakov solved the Euler equations of a rigid body in
$\mathbb{R}^n$ for arbitrary $n$; these are completely integrable. For more details see the
forthcoming  book  by  Novikov  and  his  collaborators.  (Note  added  by  author 
in  translation.) 
Along with the classical Poisson bracket of functions, one also encounters
more general (degenerate) brackets. A typical example is the Poisson bracket
of functions of the components $M_i$ of the angular momentum vector:
$\{F, G\} = \sum_{i,j} (\partial F/\partial M_i)(\partial G/\partial M_j)\{M_i, M_j\}$. Such degenerate brackets may be
considered as families of ordinary Poisson brackets or families of symplectic
manifolds. These families generally have singularities (they are not foliations):
they  consist  of  symplectic  manifolds  (leaves)  of  different  dimensions,  related 
to  one  another  by  the  condition  of  smoothness  for  the  given  degenerate 
Poisson  bracket  structure  on  the  ambient  space.  (In  the  angular  momentum 
example  above,  the  leaves  are  concentric  spheres  and  their  center  at  the 
origin.) 

In this appendix, we shall present the simplest elementary properties of
Poisson structures on finite-dimensional manifolds. One should keep in mind,
though, that in applications (especially to the mathematical physics of continuous
media) one frequently encounters Poisson structures on infinite-dimensional
manifolds. In these cases, the symplectic leaves often (but not
always) have finite dimension or codimension.


A  Poisson  manifolds 

A Poisson structure on a manifold is a Lie algebra structure on its space
of smooth functions (i.e., a bilinear skew-symmetric operation of “Poisson
bracket” on functions, satisfying the Jacobi identity) such that the operator
$\mathrm{ad}_a = \{a, \cdot\}$ (contraction of the Poisson bracket with any fixed function $a$) is
an operator of differentiation by some vector field $\partial_a$. The vector field $\partial_a$ is then
called the hamiltonian vector field with hamiltonian function $a$. The mapping
$a \mapsto \partial_a$ gives a homomorphism from the Lie algebra of functions to the Lie
algebra of vector fields. A manifold with a given Poisson structure is called a
Poisson manifold.

Two  points  on  a  Poisson  manifold  are  called  equivalent  if  they  can  be  joined 
by  a  path  consisting  of  segments  of  integral  curves  of  hamiltonian  vector  fields. 
The  equivalence  classes  under  this  relation  are  called  the  leaves  of  the  Poisson 
manifold.  The  values  of  all  possible  hamiltonian  vector  fields  at  a  given  point 
of  a  Poisson  manifold  form  a  linear  space  which  is  just  the  tangent  space  of 
the  leaf  through  that  point.  Thus  the  leaves  are  smooth  manifolds,  but  they 
are  in  general  not  closed,  and  they  have  different  dimensions. 

The classical (explicitly described by S. Lie in 1890, but essentially considered
already by Jacobi) example of a Poisson manifold is the dual space of
a (finite-dimensional) Lie algebra. The elements of the algebra itself may be
considered as linear functions on this space. The Poisson structure is defined
as an extension of the Lie algebra structure from this finite-dimensional subspace
to the entire space of smooth functions on the dual of the original Lie
algebra. Such an extension exists and is unique: if $\omega_1, \ldots, \omega_n$ is a basis of the
original Lie algebra, then

$\{a, b\}_{\text{Poisson}} = \sum_{i,j} (\partial a/\partial \omega_i)(\partial b/\partial \omega_j)\,[\omega_i, \omega_j]_{\text{Lie}}$.

In  this  example,  the  leaves  are  the  orbits  of  the  co-adjoint  representation  of 
the  underlying  Lie  group  in  the  dual  of  its  Lie  algebra. 
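
For the Lie algebra of SO(3) this bracket can be tried out directly. The sketch assumes the commutation relations $\{M_i, M_j\} = \varepsilon_{ijk} M_k$ (the sign convention is an assumption) and uses a few arbitrarily chosen polynomials as test functions; it checks the Jacobi identity on them and confirms that $M_1^2 + M_2^2 + M_3^2$ has zero bracket with a test function, so that it is constant on the co-adjoint orbits.

```python
import sympy as sp

M = sp.symbols("M1 M2 M3")

def bracket(F, G):
    """Lie-Poisson bracket on the dual of so(3), assuming {M_i, M_j} = eps_ijk M_k."""
    return sp.expand(sum(
        sp.diff(F, M[i]) * sp.diff(G, M[j]) * sp.LeviCivita(i, j, k) * M[k]
        for i in range(3) for j in range(3) for k in range(3)))

def jacobiator(F, G, H):
    """Cyclic sum {{F,G},H} + {{G,H},F} + {{H,F},G}; zero when Jacobi holds."""
    return sp.expand(bracket(bracket(F, G), H)
                     + bracket(bracket(G, H), F)
                     + bracket(bracket(H, F), G))

# Arbitrary polynomial test functions (illustrative choices only).
F = M[0] ** 2 * M[1] + M[2]
G = M[1] ** 3 - M[0] * M[2]
H = M[0] * M[1] * M[2]
casimir = M[0] ** 2 + M[1] ** 2 + M[2] ** 2

print(jacobiator(F, G, H), bracket(casimir, F))
```

The level sets of the function `casimir` are the spheres centered at the origin, in agreement with the description of the leaves.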

Every  leaf  of  a  Poisson  manifold  carries  a  natural  symplectic  structure 
(closed  nondegenerate  2-form),  defined  in  the  following  way.  Consider  the 
values  of  two  hamiltonian  vector  fields  at  a  point  of  the  leaf.  The  value  of  the 
2-form  on  this  pair  of  vectors  is  defined  to  be  the  value  of  the  Poisson  bracket 
of  the  hamiltonian  functions  at  the  given  point  (this  value  depends  only  on  the 
two  vectors  and  not  on  the  choice  of  hamiltonian  functions).  The  fact  that  the 
form  is  closed  on  the  leaf  follows  from  the  Jacobi  identity;  nondegeneracy 
comes  from  the  fact  that,  if  the  derivative  of  every  function  by  a  given  tangent 
vector  is  zero,  then  the  vector  itself  must  be  zero.  The  phase  flow  of  every 
hamiltonian  vector  field  preserves  the  symplectic  structures  on  the  leaves. 

Thus,  the  leaves  of  a  Poisson  manifold  are  even  dimensional,  and  the 
manifold may be considered as a union of symplectic manifolds (generally of
different  dimensions),  whose  symplectic  structures  are  coordinated  by  the 
condition  that  the  Poisson  bracket  on  the  ambient  space  be  smooth. 

For example, the co-adjoint orbits of SO(3) (spheres centered at the origin)
may be organized according to local Darboux coordinates: in the neighborhood
of any nonzero point, the Poisson structure in suitable local coordinates
takes the form $\{x, y\} = 1$, $\{x, z\} = \{y, z\} = 0$. This normal form for the Poisson
structure on the space of angular momenta is convenient in carrying out the
process of elimination of the nodes in the many-body problem (see Section
III.5.5 of the paper: V. I. Arnol’d, Small denominators and problems of
stability of motion in classical and celestial mechanics, Russian Math Surveys
18, No. 6 (1963), 85-191).

Jacobi  realized  that  the  (classical)  Poisson  brackets  of  the  first  integrals 
of  any  hamiltonian  system  could  be  considered  as  a  Poisson  structure  (this 
structure  is  discussed  in  Section  VI.1.3  of  the  author’s  paper  cited  above). 

The  construction  of  a  Poisson  structure  on  the  dual  space  of  a  Lie  algebra 
leads  to  a  new  Lie  algebra.  This  construction  may  then  be  repeated,  leading 
to a whole series of new (infinite-dimensional) Poisson structures. More generally,
suppose that one is given any Poisson structure on a manifold. Then
the  space  of  functions  on  that  manifold  carries  the  structure  of  a  Lie  algebra. 
This  implies  that  the  dual  space  of  this  function  space  carries  its  own  Poisson 
structure. Elements of this dual space may be interpreted as distribution densities
on the original manifold. Thus, the space of distributions on a Poisson
manifold  (for  example,  on  a  symplectic  phase  space)  has  a  natural  Poisson 
structure.  This  structure  makes  it  possible  to  apply  the  hamiltonian  formalism 
to  equations  of  Vlasov  type,  which  describe  the  evolution  of  distributions  of 
particles  in  phase  space  under  the  action  of  a  field  which  is  consistent  with  the 
particles  themselves. 
B Poisson mappings

A  mapping  from  one  Poisson  manifold  to  another  is  called  a  Poisson  mapping 
if  it  is  consistent  with  the  Poisson  structures,  i.e.,  if  for  any  two  functions 
on  the  second  manifold,  the  Poisson  bracket  of  their  pullbacks  to  the  first 
manifold  coincides  with  the  pullback  of  their  Poisson  brackets.  For  example, 
the  embedding  of  each  symplectic  leaf  in  a  Poisson  manifold  is  a  Poisson 
mapping. 

The  cartesian  product  of  two  Poisson  manifolds  has  a  natural  Poisson 
structure,  for  which  the  projection  on  each  factor  is  a  Poisson  mapping  (the 
Poisson  bracket  of  functions  pulled  back  from  different  factors  is  zero). 

S. Lie showed that every Poisson manifold is locally (in the neighborhood
of a point where the dimension of the symplectic leaves is locally constant, for
example, in the neighborhood of a generic point, where the rank is locally
maximal) decomposable into the product of a symplectic leaf and a complementary
space on which all Poisson brackets are zero.

On such a neighborhood, one may introduce coordinates $p_i, q_i, c_j$ such that
$p$ and $q$ have the usual symplectic Poisson brackets, while the Poisson bracket
of each $c_j$ with any function is equal to zero. In physics, the coordinates $p_i$ and
$q_i$ are called Clebsch variables,121 while the $c_j$’s are called Casimir functions.
Clebsch introduced his variables for the hamiltonian description of the hydrodynamics
of ideal fluids, while Casimir considered the center of the Lie algebra
of functions on the dual space of a given Lie algebra.

The  dimension  of  the  symplectic  leaf  through  a  nongeneric  point  of  a 
Poisson manifold is less than that for nearby generic points. In the neighborhood
of such a point, the Poisson manifold may still be represented as the
product of a neighborhood of the point in its symplectic leaf and a neighborhood
of a distinguished point in some Poisson manifold of complementary
dimension. In other words, on a minimal transverse manifold to a symplectic
leaf there arises a (unique up to diffeomorphism) local Poisson structure, the
so-called transverse Poisson structure (cf. A. Weinstein, The local structure of
Poisson manifolds, J. Diff. Geom. 18 (1983), 523-557).122 In the transverse
structure,  the  Poisson  brackets  of  all  functions  are  zero  at  the  distinguished 
point  (which  may  be  taken  as  the  origin  of  a  coordinate  system).  The  Taylor 
series  for  these  brackets  begin  with 

$\{x_i, x_j\} = \sum_k c_{ij}^k x_k + \cdots$,


121  Translator’s  note:  The  term  Clebsch  variables  is  also  used  to  refer  to  canonical  coordinates 
on  a  symplectic  manifold  which  projects  onto  (rather  than  embedding  into)  a  Poisson  manifold. 

122  Warning:  As  A.  B.  Givental’  has  noted,  Theorem  3.1  in  this  paper  is  incorrect.  (Translator’s 
note:  For  further  discussion,  see  A.  Weinstein,  Lie  algebras  and  Poisson  structures,  Asterisque, 
hors  serie  (1985),  257-271.) 
where $c_{ij}^k$ are the structure constants of a finite-dimensional Lie algebra (the
linearized transverse structure).

A  natural  question  arises:  Is  it  possible  to  annihilate  the  higher  order  terms 
in  the  Taylor  series  by  a  suitable  change  of  coordinates? 

The  question  of  the  form  of  transverse  structures  was  already  raised  by  the 
author in Section VI.1.3 of the previously cited article.

If  the  linearized  algebra  is  semisimple  and  the  Poisson  structure  is  analytic, 
then  one  can  eliminate  the  higher  order  terms  of  the  Taylor  series  by  an 
analytic  change  of  coordinates:  J.  Conn,  Linearization  of  analytic  Poisson 
structures,  Annals  of  Math.  119  (1984),  577-601.  An  analogous  result  is  true 
for the $C^\infty$ case, when the linearized algebra is of compact type: J. Conn,
Linearization of $C^\infty$ Poisson structures, Annals of Math. (1985).

A.  Weinstein,  along  with  his  earlier  proof  of  an  analogous  result  for  formal 
series,  expressed  the  conjecture  that  semisimplicity  was  a  necessary  condition 
for  the  annihilation  of  nonlinear  terms.  The  study  of  singularities  of  Poisson 
structures  in  the  plane  (or,  more  generally,  structures  with  symplectic  leaves 
of  codimension  2)  leads,  however,  to  a  different  conclusion. 

C  Poisson  structures  in  the  plane 

From  the  point  of  view  of  differential  geometry,  a  Poisson  structure  is  given 
by  a  smooth  bivector  field  on  a  manifold.  In  fact,  the  Poisson  brackets  at  each 
point  associate  a  number  to  each  pair  of  cotangent  vectors.  Therefore  they 
define  a  section  of  the  second  exterior  power  of  the  tangent  bundle,  i.e.,  a 
bivector  field. 

The  Jacobi  identity  expresses  a  sort  of  “closedness”  of  this  bivector  field. 
On  a  two-dimensional  manifold,  this  closedness  condition  is  automatically 
satisfied  everywhere,  so  that  every  smooth  bivector  field  on  the  plane  gives  a 
Poisson  structure.  This  circumstance  allows  one  to  apply  to  the  classification 
of  Poisson  structures  in  the  plane  the  usual  considerations  of  general  position 
(transversality, etc.). In terms of coordinates $x$, $y$, a bivector field may be
expressed in the form $f\,(\partial_x \wedge \partial_y)$, where $f$ is a smooth function. The corresponding
Poisson structure is defined by the condition

(1) $\{x, y\} = f(x, y)$.

A Poisson structure on the plane may also be given by a differential 2-form
$dx \wedge dy/f$. This form, like the bivector field, is invariantly connected with the
Poisson structure; however, unlike the bivector field, it has pole singularities
along the curve $f = 0$. The leaves in this case are the points of the curve $f = 0$
and the connected components of the complement of this curve in the plane.
Points of the curve $f = 0$ are called singular points of the Poisson structure.
In the neighborhood of a nonsingular point, any Poisson structure in the plane
may be put into the normal form $\{x, y\} = 1$.
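
The statement that every smooth function $f$ defines a Poisson structure on the plane can be checked symbolically. In the sketch below $f$, $F$, $G$, $H$ are unspecified functions of $(x, y)$, the bracket is the one determined by $\{x, y\} = f$, and the cyclic sum in the Jacobi identity is expanded; the names are illustrative.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = sp.Function("f")(x, y)                      # arbitrary coefficient of the bivector
F, G, H = (sp.Function(name)(x, y) for name in ("F", "G", "H"))

def bracket(a, b):
    """Poisson bracket on the plane determined by {x, y} = f."""
    return f * (sp.diff(a, x) * sp.diff(b, y) - sp.diff(a, y) * sp.diff(b, x))

# Cyclic sum {{F,G},H} + {{G,H},F} + {{H,F},G}; it vanishes if Jacobi holds.
jacobiator = sp.expand(bracket(bracket(F, G), H)
                       + bracket(bracket(G, H), F)
                       + bracket(bracket(H, F), G))
print(jacobiator)
```

No condition on $f$ is used anywhere, which is the two-dimensional special feature that makes general-position arguments available.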

The  following  diagram  shows  the  beginning  of  the  hierarchy  of  singularities 
of  Poisson  structures  on  the  plane  in  the  neighborhood  of  a  singular  point. 
Each letter in the diagram represents a Poisson structure which, in suitable
local coordinates with origin at the singular point under consideration, can
be written in the form $\{x, y\} = f$, where the function $f$ is given by Table 1.


Table 1

Type              f
$A_{2k}$          $x^2 + y^{2k+1}$
$A_{2k-1}^{a}$    $(x^2 \pm y^{2k})/(1 + a y^{k-1})$
$D_{2k}^{a,b}$    $(x^2 y \pm y^{2k-1})/(1 + ax + b y^{k-1})$
$D_{2k+1}^{a}$    $(x^2 y + y^{2k})/(1 + ax)$
$E_6$             $x^3 + y^4$
$E_7^{a}$         $(x^3 + x y^3)/(1 + a y^2)$
$E_8$             $x^3 + y^5$

Theorem. Given a Poisson structure on a two-dimensional manifold, it is either
reducible  in  a  neighborhood  of  each  point  to  one  of  the  normal  forms  in  Table 
1,  or  it  belongs  to  a  set  of  codimension  8  in  the  space  of  Poisson  structures. 

Thus, a generic Poisson structure may be reduced in a neighborhood of
each point to the normal form $\{x, y\} = 1$ (nonsingular point) or $\{x, y\} = y$
(point of type $A_0$). In a generic one-parameter family, one encounters for
special values of the parameter structures of the type $A_1$: $\{x, y\} = b(x^2 \pm y^2)$,
$b \neq 0$; in two-parameter families one finds $A_2$, etc.

Remark  1.  In  the  two-dimensional  case,  the  set  of  all  Poisson  structures 
forms  a  linear  space,  so  that  one  may  speak  of  a  generic  structure  or  family  of 
structures  (having  in  mind  a  structure  [family]  belonging  to  some  open  dense 
subset  of  the  space  of  structures  [families]).  The  problem  of  classifying  generic 
Poisson  structures  in  three  or  more  dimensions  is  not  uniquely  posed,  since 
the  set  of  all  such  structures  does  not  form  a  single  manifold  (one  may  find 
components  of  “different  dimensions,”  as  in  the  classification  of  Lie  algebras). 

Remark 2. The structure $\{x, y\} = y$ of type $A_0$ is the standard Poisson
structure on the dual space of the Lie algebra of the group of affine transformations
of the line. This structure was considered in 1965, in connection with the
study of the Euler equations for left-invariant metrics on groups (in this
case, the Lobachevskii metric on a half-plane), at which time it was already
realized that the structure is stable and is locally equivalent to any structure
of the form $\{x, y\} = y + \cdots$, where the dots designate higher order terms. This
(evident) observation contradicts the previously mentioned conjecture of A.
Weinstein, according to which the possibility of removing any higher order
terms by a formal change of coordinates was characteristic of the linear
Poisson structures on the dual spaces of semisimple Lie algebras.
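
Two concrete instances make the local equivalence with $\{x, y\} = y$ visible. The examples and the coordinate changes below are illustrative choices, not taken from the text: for $\{x, y\} = y + xy$ one may take $X = \log(1 + x)$, $Y = y$, and for $\{x, y\} = y + y^2$ one may take $X = x/(1 + 2y)$, $Y = y + y^2$. In both cases the new coordinates satisfy $\{X, Y\} = Y$ near the origin.

```python
import sympy as sp

x, y = sp.symbols("x y")

def bracket(a, b, f):
    """Poisson bracket on the plane determined by {x, y} = f."""
    return f * (sp.diff(a, x) * sp.diff(b, y) - sp.diff(a, y) * sp.diff(b, x))

def linearization_defect(X, Y, f):
    """{X, Y} - Y; zero when (X, Y) brings the structure to the form {X, Y} = Y."""
    return sp.simplify(bracket(X, Y, f) - Y)

# Example 1: {x, y} = y + x*y with X = log(1 + x), Y = y.
defect1 = linearization_defect(sp.log(1 + x), y, y + x * y)

# Example 2: {x, y} = y + y**2 with X = x/(1 + 2*y), Y = y + y**2.
defect2 = linearization_defect(x / (1 + 2 * y), y + y ** 2, y + y ** 2)

print(defect1, defect2)
```

In each example the map $(x, y) \mapsto (X, Y)$ has the identity as its linear part at the origin, so it is a legitimate local change of coordinates.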

Remark 3. The parameters $a$, $b$ in the table above are moduli (invariants
depending  continuously  on  the  structure).  More  precisely,  structures  equivalent 
to  a  given  one  are  found  only  a  finite  number  of  times  as  the  parameters  are 
varied. 

The  rational  functions  in  Table  1  may  be  replaced  by  polynomials,  but  it 
is  not  very  convenient  to  do  so.  The  number  of  moduli  in  the  numerator  is 
one less than the number of irreducible components of the curve $f = 0$. This is
not merely a coincidence. One invariant of a Poisson structure on the plane
is the residue constructed from the form $dx \wedge dy/f$ (initially, one constructs
a residue-form on each component, then its residue at the origin). The sum of
the residues corresponding to all the components is zero. Therefore the
number of moduli is 1 less than the number of components.

D  Powers  of  volume  forms 

The classification of Poisson structures on the plane may be considered as
the classification of differential forms of the type $f\,(dx \wedge dy)^{-1}$, where $f$ is
a smooth (or holomorphic) function. More generally, it is natural to consider
forms of the type

(2) $f\,(dx)^\alpha = f(x_1, \ldots, x_n)(dx_1 \wedge \cdots \wedge dx_n)^\alpha$,

where $\alpha$ is a fixed number, generally complex. The classification of such forms
and their deformations in the one-dimensional case, recently carried out by
V. P. Kostov, revealed the role of resonance values of $\alpha$ (certain negative
rational numbers).

For example, the resonance case $n = 1$, $\alpha = -1$ corresponds to the classification
of the singularities and their bifurcations for vector fields on the line,
i.e., singular points of differential equations $\dot x = v(x)$ and their bifurcations in
finite-parameter families. A generic one-parameter family may be reduced by
a smooth (holomorphic) change of the parameter and a smooth (holomorphic)
change of the variable $x$, depending smoothly (holomorphically) on the
parameter, to the form $\dot x = x^2 + \varepsilon + c(\varepsilon)x^3$. (For $k$ parameters, the corresponding
form is $\dot x = x^{k+1} + \varepsilon_1 x^{k-1} + \cdots + \varepsilon_k + c(\varepsilon)x^{2k+1}$.)

The nonresonance case was studied by S. Lando for all $n$ and $\alpha$: he showed
that almost every versal deformation of the function $f$ defines, after multiplication
by $(dx)^\alpha$, a versal deformation of the form, as long as $\alpha$ is not a resonance
value.
The case $\alpha = -1$, which is interesting in connection with Poisson structures,
is generally a resonance case. Instead of powers of volume forms, as in (2),
we may consider the differential forms

(3) $f^\beta\,dx$, $\beta = 1/\alpha$,

whose classification is obviously equivalent.

The hypersurface $f = 0$ is invariantly connected with the form (3). The
classification therefore begins with the reduction to normal form of the singularity
manifold $f = 0$. The beginning of the hierarchy of singular points of
hypersurfaces is known. In suitable local coordinates, a hypersurface is given
by one of the equations in the following list:


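$A_\mu$: $\pm x_1^{\mu+1} \pm x_2^2 \pm \cdots \pm x_n^2 = 0$, $\mu \geq 1$;

$D_\mu$: $x_1^2 x_2 \pm x_2^{\mu-1} \pm x_3^2 \pm \cdots \pm x_n^2 = 0$, $\mu \geq 4$;

$E_6$: $x_1^3 + x_2^4 \pm x_3^2 \pm \cdots \pm x_n^2 = 0$;

$E_7$: $x_1^3 + x_1 x_2^3 \pm x_3^2 \pm \cdots \pm x_n^2 = 0$;

$E_8$: $x_1^3 + x_2^5 \pm x_3^2 \pm \cdots \pm x_n^2 = 0$.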

After  we  have  brought  the  hypersurface  into  normal  form,  the  classification 
of  the  forms  (2)  or  (3)  comes  down  to  classifying  forms  of  the  type 

(4) $f^\beta h(x_1, \ldots, x_n)\,dx$, $h(0) \neq 0$,

where $f = 0$ is the given equation of the singularity hypersurface and $h$ is
a  smooth  (holomorphic)  function  which  remains  to  be  put  in  normal  form. 

E  The  quasi-homogeneous  case 

We shall consider here the case in which the singularity hypersurface $f = 0$ is
quasi-homogeneous (this condition holds for the cases A, D, E).

Definition. A function $f$ is called quasi-homogeneous of weight $p$, with weights
$w_i$ attached to the variables $x_i$, if it is an eigenfunction with eigenvalue $p$ for
the quasi-homogeneous Euler vector field $e$ (or is zero):

$ef = pf$, where $e = \sum_i w_i x_i\,(\partial/\partial x_i)$.

A quasi-homogeneous polynomial is called nondegenerate if the critical point
0 has finite multiplicity (i.e., it is $\mathbb{C}$-isolated). From here on, we will take the
weights to be positive numbers.
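
The definition is easy to test on examples. The sketch below takes three of the two-variable A, D, E polynomials with weights obtained by requiring every monomial to have weight 1 (the particular polynomials and the function name are illustrative) and checks that $ef = f$.

```python
import sympy as sp

x, y = sp.symbols("x y")
R = sp.Rational

def euler_field(f, weights):
    """Apply e = w1*x*d/dx + w2*y*d/dy to f."""
    return sp.expand(sum(w * v * sp.diff(f, v) for w, v in zip(weights, (x, y))))

# (polynomial, weights) pairs; each monomial should have weight 1.
examples = {
    "A3": (x ** 2 + y ** 4, (R(1, 2), R(1, 4))),
    "D5": (x ** 2 * y + y ** 4, (R(3, 8), R(1, 4))),
    "E7": (x ** 3 + x * y ** 3, (R(1, 3), R(2, 9))),
}

# e f - f should vanish for a quasi-homogeneous polynomial of weight 1.
eigen_defect = {name: sp.expand(euler_field(f, w) - f)
                for name, (f, w) in examples.items()}
print(eigen_defect)
```

A polynomial whose monomials have different weights, such as $x^2 + y^3$ with the weights $(1/2, 1/4)$, is not an eigenfunction of $e$ and fails the same check.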

Theorem. Let $f$ be a nondegenerate quasi-homogeneous polynomial of weight 1.
Then the differential form $f^\beta h\,dx$ (where $dx = dx_1 \wedge \cdots \wedge dx_n$ and $h$ is a holomorphic
function on a neighborhood of 0) may be reduced by a biholomorphic
coordinate change in a neighborhood of zero to the form $f^\beta(1 + \varphi)\,dx$, where
$\varphi$ is a quasi-homogeneous polynomial of weight $-\beta - \sigma$, $\sigma = w_1 + \cdots + w_n$.
The weight of $\varphi$ is chosen so that the weight of the form $f^\beta \varphi\,dx$ is zero.

An analogous theorem is true for smooth $h$ (and smooth coordinate changes),
except that in the real case one must replace $1 + \varphi$ by $\pm 1 + \varphi$.

Example 1. If $\beta$ is positive, then $\varphi = 0$, so that the complex form reduces
to $f^\beta\,dx$.

More generally, $\varphi = 0$ if the (possibly complex) number $\beta$ is not a negative
rational number: in this case, a nonzero quasi-homogeneous polynomial of
weight $-\beta - \sigma$ does not exist. If the polynomial $f$ (or just its quasi-homogeneity
type $w$) is fixed, then the resonance values of $\beta$ form a finite set
of arithmetic progressions in the negative rationals (for the remaining $\beta$,
$f^\beta h\,dx$ reduces to the form $f^\beta\,dx$).

Example 2. If $\beta = -1$, then the monomials occurring in $\varphi$ may be enumerated
by the interior integral points of the Newton diagram of $f$. The monomial
$x^m = x_1^{m_1} \cdots x_n^{m_n}$ corresponds to the point $(m_1 + 1, \ldots, m_n + 1)$ of the diagram
(i.e., the exponent of the form $x^m\,dx$).

Example 3. Suppose that $\beta = -1$, $n = 3$, and $f$ is one of the A, D, E polynomials
introduced above, defining a simple singularity. Calculating weights, we find
that $-\beta - \sigma < 0$; therefore $\varphi = 0$, from which we obtain:
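
The weight calculation can be carried out with exact fractions. The sketch below assumes the three-variable normal forms $x^{\mu+1} + y^2 + z^2$, $x^2y + y^{\mu-1} + z^2$, $x^3 + y^4 + z^2$, $x^3 + xy^3 + z^2$ and $x^3 + y^5 + z^2$ (signs do not affect weights), normalizes $f$ to weight 1, and evaluates $-\beta - \sigma = 1 - \sigma$ for a finite range of $\mu$; the range and the function names are illustrative.

```python
from fractions import Fraction as Fr

def sigma_A(mu):
    """Sum of weights for x^(mu+1) + y^2 + z^2."""
    return Fr(1, mu + 1) + Fr(1, 2) + Fr(1, 2)

def sigma_D(mu):
    """Sum of weights for x^2 y + y^(mu-1) + z^2."""
    w_y = Fr(1, mu - 1)
    w_x = (1 - w_y) / 2          # from 2*w_x + w_y = 1
    return w_x + w_y + Fr(1, 2)

sigma_E = {
    6: Fr(1, 3) + Fr(1, 4) + Fr(1, 2),   # x^3 + y^4 + z^2
    7: Fr(1, 3) + Fr(2, 9) + Fr(1, 2),   # x^3 + x y^3 + z^2
    8: Fr(1, 3) + Fr(1, 5) + Fr(1, 2),   # x^3 + y^5 + z^2
}

# Weight of phi for beta = -1 is 1 - sigma.
weights_of_phi = ([1 - sigma_A(mu) for mu in range(1, 30)]
                  + [1 - sigma_D(mu) for mu in range(4, 30)]
                  + [1 - s for s in sigma_E.values()])
print(max(weights_of_phi))
```

Since all variable weights are positive, a nonzero polynomial cannot have negative weight, so a negative value of $1 - \sigma$ forces $\varphi = 0$.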


Corollary 1. The form with pole singularity

$\dfrac{h(x, y, z)\,dx \wedge dy \wedge dz}{f(x, y, z)}$, $h(0) \neq 0$,

where $f$ is one of the polynomials A, D, E, may be reduced to the form
$dx \wedge dy \wedge dz/f$ by a holomorphic (smooth) change of coordinates.


In exactly the same way for any $n > 3$, a factor $h(x_1, \ldots, x_n)$ which does not
vanish at the origin can be converted to unity.


Corollary 2. A simple form (i.e., one not having moduli) of the type
$dx_1 \wedge \cdots \wedge dx_n/f(x_1, \ldots, x_n)$, where $f$ is a holomorphic (smooth) function near the origin
and $n > 2$, may be reduced by a coordinate change in a neighborhood of the
origin to a normal form in which $f$ is either 1 or one of the A, D, E polynomials.

Corollary 3. A simple (not having moduli) $n$-vector field in $n$-dimensional space
($n > 2$) is locally equivalent to a normal form $f \cdot (\partial_1 \wedge \cdots \wedge \partial_n)$, where $f$ is
either 1 or one of the A, D, E polynomials; $\partial_k = \partial/\partial x_k$.

Corollary 4. For $l \leq 6$, in generic $l$-parameter families of $n$-vector fields on
$n$-dimensional space ($n > 2$), the field in a neighborhood of each point and for
each value of the parameters is equivalent to one of the simple fields in the
preceding corollary.
Corollary 5. For $l \leq 6$, in generic $l$-parameter families of forms $dx \wedge dy \wedge dz/f(x, y, z)$,
one finds only forms which in the neighborhood of each point are
locally equivalent to one of the following 24 types:

$\dfrac{dx \wedge dy \wedge dz}{f}$, where $f$ is one of

$1$,  $x$,  $x^2 + y^2 \pm z^2$,  $x^3 + y^2 \pm z^2$,
$x^4 \pm y^2 \pm z^2$,  $x^5 + y^2 \pm z^2$,  $x^6 \pm y^2 \pm z^2$,  $x^7 + y^2 \pm z^2$,
$x^2 y \pm y^3 + z^2$,  $x^2 y + y^4 \pm z^2$,  $x^2 y \pm y^5 + z^2$,  $x^3 + y^4 \pm z^2$.


For $n = 2$ and $\beta = -1$, the theorem may be applied in the following way.

Corollary 6. Let f be a nondegenerate quasi-homogeneous polynomial of weight
1 with argument weights $w_1$, $w_2$. Then the form

$$\frac{h(x, y)\, dx \wedge dy}{f(x, y)}, \qquad h(0) \neq 0,$$

where h is a smooth (holomorphic) function in a neighborhood of 0, can be
reduced by a suitable smooth (holomorphic) coordinate change to a form in
which $h = \pm 1 + \varphi$, where $\varphi$ is a quasi-homogeneous polynomial of
weight $1 - w_1 - w_2$.

Correspondingly, bivector fields and Poisson structures may be locally
reduced to the form

$$\frac{f(x, y)\,(\partial_x \wedge \partial_y)}{\pm 1 + \varphi(x, y)}, \qquad \{x, y\} = \frac{f(x, y)}{\pm 1 + \varphi(x, y)}.$$

Calculating the weights of the simple singularity types A, D, E for functions
of two variables, we obtain Table 1 from the last corollary. For example, for
$A_1$ we have $w_1 = w_2 = \tfrac{1}{2}$, the weight of $\varphi$ equals 0, and so
$\varphi$ is constant.

The dimension of the space of equivalence classes of forms $h\,dx \wedge dy / f$,
where $h(0) \neq 0$ and f is a fixed nondegenerate quasi-homogeneous polynomial,
equals  the  dimension  of  the  space  of  quasi-homogeneous  polynomials  of  weight  o. 
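
The weight count behind Corollary 6 is easy to mechanize. The sketch below assumes the standard two-variable normal forms A_k: x^(k+1) + y^2, D_k: x^2 y + y^(k-1), E_6: x^3 + y^4, E_7: x^3 + x y^3, E_8: x^3 + y^5 (signs are omitted, since they do not affect weights). These normal forms and the helper names are assumptions made for illustration. For each type it lists the exponents (a, b) of the monomials x^a y^b of weight 1 − w_1 − w_2, which span the space of admissible $\varphi$ in Corollary 6.

```python
from fractions import Fraction as F

def weights(kind, k):
    """Weights (w_x, w_y) making the assumed normal form quasi-homogeneous of weight 1."""
    if kind == "A":                      # x^(k+1) + y^2
        return F(1, k + 1), F(1, 2)
    if kind == "D":                      # x^2 y + y^(k-1)
        return F(k - 2, 2 * (k - 1)), F(1, k - 1)
    if kind == "E":                      # E6: x^3 + y^4, E7: x^3 + x y^3, E8: x^3 + y^5
        return {6: (F(1, 3), F(1, 4)),
                7: (F(1, 3), F(2, 9)),
                8: (F(1, 3), F(1, 5))}[k]
    raise ValueError(kind)

def phi_monomials(kind, k):
    """Exponents (a, b) with a*w_x + b*w_y = 1 - w_x - w_y."""
    wx, wy = weights(kind, k)
    target = 1 - wx - wy
    found = []
    a = 0
    while a * wx <= target:
        b = (target - a * wx) / wy       # exact rational arithmetic
        if b.denominator == 1:
            found.append((a, int(b)))
        a += 1
    return found

for kind, k in [("A", 1), ("A", 2), ("A", 3), ("D", 4), ("E", 6), ("E", 7)]:
    print(kind, k, phi_monomials(kind, k))
```

For A_1 the only admissible monomial is the constant, in agreement with the remark that $\varphi$ is constant in that case; for A_2 and E_6 the list is empty, so $\varphi = 0$.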


F Varchenko's theorem

A.  N.  Varchenko  has  proven  a  series  of  generalizations  of  the  preceding 
theorem.  Here  we  shall  describe  the  simplest  of  these. 

1. Let f be a quasi-homogeneous polynomial of weight 1 in the variables
$x_1, \ldots, x_n$ with weights $w_1, \ldots, w_n$. Suppose that, for some set I of
multi-indices, the residue classes of the monomials $x^m$, $m \in I$, generate (as a
vector space) the factor algebra of the algebra of formal power series

$$\mathbb{C}[[x_1, \ldots, x_n]] / (\partial f/\partial x_1, \ldots, \partial f/\partial x_n).$$

Theorem. Every germ $f^\beta h\,dx$ is equivalent to a germ of the form
$f^\beta \bigl(1 + \sum \lambda_{m,l}\, x^m f^l\bigr)\,dx$, where the l's are
nonnegative integers and the m's are elements of I such that the weight of each
form $f^\beta x^m f^l\,dx$ is equal to zero.

2. We define the degree of non-quasi-homogeneity of the germ f to be the
dimension of the factor space
$(f, \partial f/\partial x_1, \ldots, \partial f/\partial x_n) / (\partial f/\partial x_1, \ldots, \partial f/\partial x_n)$.
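
A short computation may help in reading this definition. The symbols $\mu$ and $\tau$ below (the dimensions of the two local algebras, usually called the Milnor and Tjurina numbers) are standard notation but are not used in the text; the second line assumes f is quasi-homogeneous of weight 1 with weights $w_i$ and applies Euler's identity.

```latex
\[
\dim \frac{(f, \partial f/\partial x_1, \ldots, \partial f/\partial x_n)}
          {(\partial f/\partial x_1, \ldots, \partial f/\partial x_n)}
  = \mu - \tau,
\qquad
\mu = \dim \frac{\mathbb{C}[[x]]}{(\partial f/\partial x_i)},
\quad
\tau = \dim \frac{\mathbb{C}[[x]]}{(f, \partial f/\partial x_i)} .
\]
\[
f = \sum_{i=1}^{n} w_i\, x_i\, \frac{\partial f}{\partial x_i}
\ \Longrightarrow\
f \in (\partial f/\partial x_1, \ldots, \partial f/\partial x_n)
\ \Longrightarrow\
\mu - \tau = 0 .
\]
```

So a quasi-homogeneous germ has degree of non-quasi-homogeneity zero, which is the case singled out in the corollary of item 3.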

Theorem. For almost all $\beta$, the number of moduli of the form
$f^\beta h\, dx_1 \wedge \cdots \wedge dx_n$ (for fixed $\beta$ and f and arbitrary h,
$h(0) \neq 0$) is equal to the degree of non-quasi-homogeneity of the germ f. The
exceptional (resonance) values of $\beta$ consist of a finite number of arithmetic
progressions of negative rational numbers, with difference −1. In particular, for
any $\beta \geq 0$, the number of moduli equals the degree of non-quasi-homogeneity.

3. Example. For $\beta = 0$, we obtain:

Corollary. The number of moduli of the form $h\,dx$ ($h(0) \neq 0$), relative to the
group of diffeomorphisms preserving the germ of f, equals the degree of
non-quasi-homogeneity of f (equal to zero, if the germ of f is equivalent to a
quasi-homogeneous one).

4.  In  the  resonance  cases,  the  result  is  more  complicated. 

Example. Let $n = 2$, $\beta = -1$ (Poisson structures in the plane).

Theorem. The number of moduli for a germ of a Poisson structure with given
singular curve f = 0 equals the degree of non-quasi-homogeneity of the germ
of f augmented by one less than the number of irreducible components of the
germ of the curve f = 0.

In resonance cases, the number of moduli behaves in a rather regular
way along each arithmetic progression with difference −1. Namely, when $\beta$
decreases by 1 the number of moduli increases (not necessarily strictly), but
its maximal value does not exceed (for any $\beta > -n$) the “nonresonant” value
(i.e., the degree of non-quasi-homogeneity of f) by more than the number
of Jordan blocks associated with the eigenvalue $e^{2\pi i \beta}$ of the monodromy
operator of the function f.

G  Poisson  structures  and  period  mappings 

An  interesting  source  of  Poisson  structures  is  provided  by  the  period  mappings 
of  critical  points  of  holomorphic  functions  (A.  N.  Varchenko  and  A.  B. 
Givental’,  Mapping  of  periods  and  intersection  form,  Funct.  Anal.  Appl.  16, 
(1982),  83-93). 

Period  mappings  allow  one  to  transfer  to  the  base  of  a  fibre  bundle  certain 
structures  which  live  on  the  (co)homology  spaces  of  the  fibres.  A  Poisson 
structure on the base arises in this way from the intersection form in the
middle-dimensional  homology  of  the  fibres,  when  this  form  is  skew-symmetric. 

Period  mappings  are  defined  by  the  following  construction.  Suppose  that 
one  is  given  a  locally  trivial  fibration.  Associated  to  such  a  fibration  are  the 
bundles  (over  the  same  base)  of  homology  and  cohomology  of  the  fibres  with 
complex  coefficients.  These  bundles  are  not  only  locally  trivial,  but  they  are 
locally  trivialized  in  a  canonical  way  (the  integer  cycles  in  a  fibre  are  uniquely 
identifiable  with  integer  cycles  in  the  nearby  homology  fibres).  A  period 
mapping  is  defined  as  a  section  of  the  cohomology  bundle. 

Suppose  now  that  one  is  given,  on  the  total  space  of  a  differentiable  fibre 
bundle,  a  differential  form  which  is  closed  on  each  fibre.  The  period  mapping 
of  this  form  associates  to  each  point  of  the  base  the  cohomology  class  of  the 
form  on  the  fibre  over  this  point. 

If  one  is  given  a  vector  field  on  the  base  of  the  fibration,  then  any  (smooth) 
period  mapping  may  be  differentiated  along  this  vector  field,  and  the  derivative 
is  again  a  period  mapping.  In  fact,  neighboring  fibres  of  the  cohomology 
bundle  are  identified  with  one  another  by  the  above-mentioned  “integer”  local 
trivialization,  so  a  section  may  be  considered  (locally)  as  a  map  into  one  fibre 
and  may  be  differentiated  as  an  ordinary  (vector-valued)  function. 

Suppose  now  that  the  base  is  a  complex  manifold  having  the  same  complex 
dimension  as  the  fibres  of  the  cohomology  bundle.  A  period  mapping  is  called 
nondegenerate  if  its  derivatives  along  any  C-independent  vectors  at  each 
point  are  linearly  independent.  In  other  words,  a  period  mapping  is  non¬ 
degenerate  if  the  corresponding  local  maps  from  the  base  to  typical  fibres  are 
diffeomorphisms. 

The  derivative  of  a  nondegenerate  period  mapping  thus  allows  us  to  map 
the  tangent  bundle  of  the  base  isomorphically  onto  the  cohomology  bundle. 
The  dual  isomorphism  goes  from  the  homology  bundle  to  the  cotangent 
bundle of the base. This isomorphism transfers to the base any additional
structures carried by the homology groups.

Suppose that the fibres of our original bundle are (real) oriented
even-dimensional manifolds, and consider their homology in the middle dimension.
In  this  case,  the  homology  of  each  fibre  carries  a  bilinear  form:  the  index  of 
intersection.  This  form  is  symmetric  if  the  dimension  of  the  fibre  is  a  multiple 
of  4;  otherwise,  it  is  skew-symmetric.  The  form  is  nondegenerate  if  the  fibre  is 
closed  (i.e.,  compact  and  without  boundary);  otherwise,  it  may  be  degenerate. 
We  shall  suppose  below  that  we  are  in  the  situation  where  the  form  is 
skew-symmetric. 

In  this  situation  a  nondegenerate  period  mapping  induces  a  Poisson  structure 
on  the  base.  In  fact,  the  isomorphism  described  above,  between  the  cotangent 
spaces  of  the  base  and  the  homology  groups  of  the  fibres  (carrying  their 
skew-symmetric  intersection  forms),  defines  a  skew-symmetric  bilinear  form 
on  pairs  of  cotangent  vectors.  The  Poisson  bracket  of  two  functions  on  the 
base  is  defined  as  the  value  of  this  form  on  the  differentials  of  the  functions. 

This bracket defines a Poisson structure (of constant rank) on the base.
This is obvious from the fact that the local identification of the base with
the cohomology of the typical fibre, given by the period mapping, provides the
base with local coordinates whose Poisson brackets are constant.¹²³

Figure 247 Poisson structure and the swallowtail

Varchenko  and  Givental’  observed  that  if  one  constructs,  in  the  way  just 
described,  using  a  generic  1-form,  a  Poisson  structure  on  the  complement  of 
the  discriminant  locus  in  the  base  of  a  versal  deformation  of  a  critical  point 
of  a  function  of  two  variables,  then  this  structure  may  be  holomorphically 
extended  across  the  discriminant  locus.  (One  may  replace  the  discriminant 
locus  above  by  the  wave  front  of  a  typical  singularity.)  We  shall  limit  ourselves 
here  to  the  simplest  examples  of  Poisson  structures  arising  in  this  way. 

Consider the three-dimensional space of polynomials
$\mathbb{C}^3 = \{x^4 + \lambda_1 x^2 + \lambda_2 x + \lambda_3\}$ with coordinates
$\lambda_k$. The polynomials with multiple roots form
therein the discriminant surface (a swallowtail; see Figure 247).

The Poisson structures arising from period mappings may be reduced
(by diffeomorphisms preserving the swallowtail) to the following form: the
symplectic leaves are the planes $\lambda_2 = \mathrm{const}$, and their symplectic
structures are of the form $d\lambda_1 \wedge d\lambda_3$.
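
In coordinates, this normal form corresponds to the following bracket. The display is a sketch under an assumed sign convention (the bracket dual to $d\lambda_1 \wedge d\lambda_3$ is normalized so that $\{\lambda_1, \lambda_3\} = 1$); F and G denote arbitrary functions on the base and are not notation from the text.

```latex
\[
\{F, G\}
  = \frac{\partial F}{\partial \lambda_1}\,\frac{\partial G}{\partial \lambda_3}
  - \frac{\partial F}{\partial \lambda_3}\,\frac{\partial G}{\partial \lambda_1},
\qquad
\{\lambda_1, \lambda_3\} = 1,
\qquad
\{\lambda_2, F\} = 0 \ \text{ for every } F .
\]
```

Thus $\lambda_2$ is a Casimir function, its level planes are the symplectic leaves, and the rank of the structure is 2 at every point.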

The fibration of interest here is formed by the complex curves
$\{(x, y) : y^2 = x^4 + \lambda_1 x^2 + \lambda_2 x + \lambda_3\}$, and the period
mapping is given by, for example, the form $y\,dx$. (See V. I. Arnold,
A. N. Varchenko, S. M. Gusein-Zade, Singularities of Differentiable Mappings,
Vol. 2: Monodromy and the Asymptotics of Integrals, Birkhäuser, 1988, §15, or
Uspekhi Mat. Nauk 40, no. 5 (1985).)


123  In  the  case  where  the  intersection  form  is  symmetric,  the  analogous  construction  defines  on 
the  base  a  flat  pseudo-riemannian  (possibly  degenerate)  metric. 
The Poisson structures on the swallowtail space which arise from period
mappings may be characterized locally among all generic structures by the
following property: the line of self-intersections of the tail lies entirely in one
symplectic leaf. The required genericity condition is that the tangent planes at
the origin to the symplectic leaf and the swallowtail do not coincide. Every
smooth function which is constant along the line of self-intersections of the
tail, and whose derivative along the symplectic leaf at the origin is nonzero,
may be reduced in a neighborhood of the origin, by a diffeomorphism preserving
the tail, to the form $\lambda_2 + \mathrm{const}$; also, a family of holomorphic
symplectic structures in the planes $\lambda_2 = \mathrm{const}$ may be reduced to
the form $d\lambda_1 \wedge d\lambda_3$
by a holomorphic local diffeomorphism of three-dimensional space which
preserves the swallowtail as well as the foliation by the planes.
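
The normal form with leaves $\lambda_2 = \mathrm{const}$ is consistent with this property, as a short computation suggests. The parameter s is introduced only for this check; the family is the one considered above, $x^4 + \lambda_1 x^2 + \lambda_2 x + \lambda_3$. A polynomial of this family with two double roots has no cubic term, so the two roots sum to zero and may be written as $\pm s$:

```latex
\[
(x - s)^2 (x + s)^2 = (x^2 - s^2)^2 = x^4 - 2 s^2 x^2 + s^4
\quad\Longrightarrow\quad
\lambda_1 = -2 s^2, \qquad \lambda_2 = 0, \qquad \lambda_3 = s^4 .
\]
```

The line of self-intersections is therefore the curve $\lambda_3 = \lambda_1^2/4$ in the plane $\lambda_2 = 0$, which is a single leaf of the normal form.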

One may conjecture more generally that those Poisson (in particular,
symplectic) structures on the base of a versal deformation of a singularity,
induced from the intersection form by an infinitesimally stable period mapping,
may be characterized (up to diffeomorphisms preserving the bifurcation set) by
a natural condition on the rank of the restriction of the Poisson structure to
the strata of the discriminant locus. The “natural condition” in the
three-dimensional example above is that the line of self-intersections of the
swallowtail be contained in a symplectic leaf. In four-dimensional space, an
analogous role would apparently be played by the condition that a certain
submanifold be lagrangian, namely, the manifold of polynomials having two
critical points with critical value zero in the symplectic space of polynomials
$x^5 + \lambda_1 x^3 + \lambda_2 x^2 + \lambda_3 x + \lambda_4$ (the ranks of the
symplectic structure on the tangent spaces
to the other strata may also be important).
A system of Jacobi’s elliptic coordinates is associated to each ellipsoid in
euclidean  space.  These  coordinates  make  it  possible  to  integrate  the  equations 
of  geodesics  on  the  given  ellipsoid,  as  well  as  certain  other  equations,  such  as 
the  equations  of  motion  for  a  point  on  a  sphere  under  the  influence  of  a  force 
with  quadratic  potential,  or  for  a  point  on  a  paraboloid  under  the  influence 
of  a  uniform  gravitational  field. 

These  facts  suggest  that,  even  on  an  infinite-dimensional  Hilbert  space, 
there  should  be  a  class  of  integrable  systems  associated  to  each  symmetric 
operator.  To  study  these  systems,  it  is  necessary  to  extend  the  theory  of  elliptic 
coordinates  to  the  infinite-dimensional  case.  To  do  this,  it  is  first  necessary  to 
express the finite-dimensional theory of confocal quadric surfaces in
coordinate-free form.

In  the  transition  to  the  infinite-dimensional  case,  symmetric  operators  on 
finite-dimensional  euclidean  spaces  must  be  replaced  by  self-adjoint  operators 
on  Hilbert  spaces.  Since  the  elliptic  coordinates  are  not  really  connected  with 
the  operator  itself,  but  rather  with  its  resolvent,  the  unboundedness  of  the 
original  operator  (which  might  be,  for  example,  a  differential  operator)  does 
not  present  a  serious  obstacle. 

In  some  cases,  the  elliptic  coordinates  on  Hilbert  space  obtained  from  a 
self-adjoint  operator  form  a  countable  sequence;  however,  when  the  operator 
has  a  continuous  spectrum,  the  coordinates  form  a  continuous  family.  In  this 
case,  the  transformation  from  the  original  point  of  the  Hilbert  space  (thought 
of  as  a  function  space)  to  the  continuous  family  of  elliptic  coordinates  of  the 
point  may  be  considered  as  a  nonlinear  mapping  between  function  spaces. 
This  mapping,  by  analogy  with  the  Fourier  transform,  might  be  called  the 
Jacobi  transform:  the  original  function  is  transformed  into  a  function  which 
expresses  the  elliptic  coordinates  in  terms  of  some  continuous  “index.”  (More 
precisely,  the  result  of  the  transform  is  a  measure  on  the  spectral  parameter 
axis.)  The  study  of  the  functional  analytic  properties  and  the  inversion  of  the 
Jacobi  transform  will  probably  be  accomplished  before  too  long. 

Following  an  exposition  of  the  general  theory  of  elliptic  coordinates,  we 
shall  describe  below  some  of  the  applications  of  these  coordinates  to  potential 
theory. 

This  appendix  is  based  on  the  following  papers  by  the  author. 

Some  remarks  on  elliptic  coordinates,  Notes  of  the  LOMI  Seminar  (volume 
dedicated  to  L.  D.  Faddeev  on  his  50th  birthday),  133  (1984),  38-50. 

Integrability  of  hamiltonian  systems  associated  with  quadrics  (after  J. 
Moser),  Uspekhi  34,  no.  5,  214. 

Some algebro-geometrical aspects of the Newton attraction theory, Progress
in Math. (I. R. Shafarevich volume), 36 (1983), 1-4.
A Elliptic coordinates and confocal quadrics

Elliptic  coordinates  in  euclidean  space  are  defined  with  the  aid  of  confocal 
quadrics  (surfaces  of  degree  two).  The  geometry  of  these  quadrics  is  obtained 
from  the  geometry  of  pencils  of  quadratic  forms  in  euclidean  space  (i.e.,  from 
the  theory  of  principal  axes  of  ellipsoids  or  from  the  theory  of  small  oscillations) 
by  a  passage  to  the  dual  space. 

Definition 1. A euclidean pencil of quadrics (resp. quadratic forms) in a euclidean
vector space V is a one-parameter family of surfaces of degree two

$$\tfrac{1}{2}(A_\lambda x, x) = 1$$

(resp. forms $A_\lambda$), where

$$A_\lambda = A - \lambda E \qquad (E = \text{“identity”}),$$

and where A is a symmetric operator

$$A\colon V \to V^*, \qquad A^* = A.$$

Definition  2.  A  confocal  family  of  quadrics  in  a  euclidean  space  W  is  a  family 
of  quadrics  dual  to  the  quadrics  of  a  euclidean  pencil  in  W*: 

$$\tfrac{1}{2}(A_\lambda^{-1} \xi, \xi) = 1.$$

Thus,  quadrics  which  are  confocal  to  one  another  form  a  one-parameter 
family,  but  the  quadratic  forms  defining  the  family  do  not  depend  linearly  on 
the  parameter. 

Example. The family of plane curves which are confocal to a given ellipse
consists of all those ellipses and hyperbolas with the same foci. In Figure 248,
the curves of a confocal family are shown on the left, and the curves of the
corresponding euclidean pencil are shown on the right.

Figure 248 A confocal family and the corresponding euclidean pencil

The elliptic coordinates of a point are the values of the parameter $\lambda$ for
which the corresponding quadrics of a fixed confocal family pass through the point.
We fix an ellipsoid in euclidean space with all its axes of different lengths.

Theorem  1  (Jacobi).  Through  each  point  of  an  n-dimensional  euclidean  space 
there  pass  n  quadrics  confocal  to  a  given  ellipsoid.  Smooth  confocal  quadrics 
intersect  at  right  angles. 

Proof. Each point other than 0 in our space corresponds to an affine
hyperplane in the dual space, consisting of those linear functionals whose value
is 1 at the given point. In terms of the dual space, Theorem 1 means that every
hyperplane not passing through 0 in an n-dimensional euclidean space is
tangent to precisely n of the quadrics in a euclidean pencil, and the vectors
from 0 to the points of tangency are pairwise orthogonal (Figure 248, right).

The proof of the property of euclidean pencils just stated is based on the
fact that the aforementioned vectors define the principal axes of the
quadratic form $B = \tfrac{1}{2}(Ax, x) - (l, x)^2$, where $(l, x) = 1$ is the
equation of the hyperplane.

As a matter of fact, on a principal axis of any quadratic form B,
corresponding to the proper value $\lambda$, the form $B - \lambda E$ reduces to 0
along with its gradient. The vanishing of this form at the point of intersection
of the principal axis and the hyperplane means that the point of intersection
lies on the quadric $\tfrac{1}{2}(A_\lambda x, x) = 1$, while the vanishing of the
gradient means that the quadric and the hyperplane are tangent at the point. □
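
Theorem 1 can be checked numerically in coordinates. The sketch below assumes that A is diagonal with distinct entries a_1 < ... < a_n and that no coordinate of the point ξ vanishes; the function names, the sample values and the bisection solver are illustrative choices, not part of the text. On each of the intervals (−∞, a_1), (a_1, a_2), ..., (a_{n−1}, a_n) the sum of ξ_i²/(a_i − λ) is increasing in λ and takes the value 2 exactly once, which yields the n elliptic coordinates; the normals A_λ^{-1}ξ at these n values are then available for an orthogonality test.

```python
def secular(a, xi, lam):
    """Value of (A_lam^{-1} xi, xi) - 2 for diagonal A = diag(a)."""
    return sum(x * x / (ai - lam) for ai, x in zip(a, xi)) - 2.0

def elliptic_coordinates(a, xi, iters=200):
    """Roots lam of 1/2 (A_lam^{-1} xi, xi) = 1, one per interval between poles.

    Assumes a is strictly increasing and every xi[i] is nonzero.
    """
    norm2 = sum(x * x for x in xi)
    # Left ends of the intervals; secular() is negative just to the right of each.
    lows = [a[0] - norm2 / 2.0 - 1.0] + list(a[:-1])
    roots = []
    for lo, hi in zip(lows, a):
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:   # interval exhausted in floating point
                break
            if secular(a, xi, mid) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def normal(a, xi, lam):
    """Gradient of 1/2 (A_lam^{-1} xi, xi) at xi, i.e. A_lam^{-1} xi."""
    return [x / (ai - lam) for ai, x in zip(a, xi)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

a = [1.0, 2.0, 4.0]      # sample operator A = diag(a), an arbitrary choice
xi = [0.7, -0.4, 1.1]    # sample point, an arbitrary choice
lams = elliptic_coordinates(a, xi)
normals = [normal(a, xi, lam) for lam in lams]
```

The pairwise products `dot(normals[i], normals[j])` for i ≠ j should vanish up to rounding, which is the right-angle statement of the theorem for this sample point.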

Theorem 2 (Chasles). Given a family of confocal quadrics in n-dimensional
euclidean space, a line in general position is tangent to n − 1 different quadrics
in the family, and the planes tangent to the quadrics at the points of tangency
are pairwise orthogonal.
Proof. We project the quadrics in the confocal family along a pencil of parallel
lines  onto  the  hyperplane  perpendicular  to  the  pencil.  Each  quadric  defines  an 
apparent  contour  (the  set  of  critical  values  of  the  projection  of  the  quadric). 
For  a  projection  whose  direction  is  in  general  position,  the  apparent  contour 
is  a  quadric  (i.e.,  a  surface  of  degree  two)  in  the  image  hyperplane. 

Here  we  need  a  lemma. 

Lemma.  The  apparent  contours  of  the  quadrics  in  a  confocal  family  form 
themselves  a  confocal  family  of  quadrics. 

Proof.  On  passage  to  the  dual,  sections  become  projections  and  vice  versa. 
The  apparent  contours  of  the  projections  of  confocal  quadrics  along  a  pencil 
of  parallel  lines  are  therefore  dual  to  the  sections  of  the  dual  quadrics  by  a 
hyperplane  passing  through  the  origin. 

The  sections  of  the  quadrics  in  a  euclidean  pencil  by  a  hyperplane  through 
0  form  a  euclidean  pencil  of  quadrics  in  the  hyperplane.  The  lemma  now 
follows by duality. □

Returning  to  the  proof  of  Theorem  2,  we  apply  the  lemma  above  to  the 
projections  along  the  line  in  the  statement  of  the  theorem.  According  to  the 
lemma,  the  apparent  contours  of  the  projections  of  the  confocal  quadrics  in 
Theorem  2  form  a  confocal  family  of  quadrics  in  a  hyperplane.  By  Theorem  1, 
n − 1 of these apparent contours pass through each point, where they intersect
at  right  angles.  This  completes  the  proof  of  Theorem  2.  □ 
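
Theorem 2 admits a similar numerical check for n = 3. The sketch assumes the confocal family ½ Σ ξ_i²/(a_i − λ) = 1 with A = diag(a) and a line ξ = p + t v; the sample values and helper names are hypothetical. Substituting the line into the quadric gives a quadratic equation in t, and the vanishing of its discriminant, cleared of denominators, is a polynomial of degree n − 1 = 2 in λ, solved here in closed form.

```python
import math

def tangent_parameters(a, p, v):
    """Values of lam for which the line p + t v touches the quadric (case n = 3)."""
    a1, a2, a3 = a
    m12 = p[0] * v[1] - p[1] * v[0]
    m13 = p[0] * v[2] - p[2] * v[0]
    m23 = p[1] * v[2] - p[2] * v[1]
    v1, v2, v3 = (c * c for c in v)
    # Tangency condition multiplied by (a1 - lam)(a2 - lam)(a3 - lam):
    #   qa*lam^2 + qb*lam + qc = 0
    qa = 2.0 * (v1 + v2 + v3)
    qb = (-2.0 * (v1 * (a2 + a3) + v2 * (a1 + a3) + v3 * (a1 + a2))
          + (m12 ** 2 + m13 ** 2 + m23 ** 2))
    qc = (2.0 * (v1 * a2 * a3 + v2 * a1 * a3 + v3 * a1 * a2)
          - (m12 ** 2 * a3 + m13 ** 2 * a2 + m23 ** 2 * a1))
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    return [(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)]

def contact(a, p, v, lam):
    """Point of tangency on the line and the normal to the quadric there."""
    c = [1.0 / (ai - lam) for ai in a]
    cpv = sum(ci * pi * vi for ci, pi, vi in zip(c, p, v))
    cvv = sum(ci * vi * vi for ci, vi in zip(c, v))
    t = -cpv / cvv                       # double root of the quadratic in t
    point = [pi + t * vi for pi, vi in zip(p, v)]
    normal = [ci * xi for ci, xi in zip(c, point)]
    return point, normal

a = (1.0, 2.0, 4.0)      # sample operator A = diag(a), an arbitrary choice
p = (0.5, 0.3, -0.2)     # sample point on the line
v = (1.0, 2.0, 3.0)      # sample direction of the line
lams = tangent_parameters(a, p, v)
(pt1, n1), (pt2, n2) = (contact(a, p, v, lam) for lam in lams)
```

Each normal is orthogonal to v (the line is tangent at the contact point), and the scalar product of `n1` and `n2` should vanish up to rounding, as the theorem asserts for the two tangent planes.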

Theorem 3 (Jacobi and Chasles). Given a geodesic on a quadric Q in
n-dimensional space, there is a set of n − 2 quadrics confocal to Q such that all
the tangent lines to the geodesic are also tangent to the quadrics in the set.


Proof  (Beginning).  We  consider  the  manifold  of  oriented  lines  in  euclidean 
space.  This  manifold  has  a  natural  symplectic  structure  as  the  manifold  of 
characteristics in the hypersurface $p^2 = 1$ in the phase space of a free particle
moving  under  its  own  inertia  in  our  euclidean  space. 

(The  characteristics  on  a  hypersurface  in  a  symplectic  manifold  are  the 
integral  curves  of  the  field  of  characteristic  directions,  i.e.,  the  field  of  directions 
which  are  skew-orthogonal  to  the  tangent  spaces  of  the  hypersurface.  In  other 
words,  the  characteristics  of  the  hypersurface  are  the  phase  curves  for  any 
hamiltonian  flow  whose  hamiltonian  function  vanishes  to  first  order  on  the 
hypersurface. 

The symplectic structure on the manifold of characteristics on a
hypersurface in a symplectic manifold is defined in such a way that the
skew-scalar product of any two vectors tangent to the hypersurface is equal to
the skew-scalar product of their projections in the manifold of characteristics.

Note, finally, that the notion of characteristics is equally well defined for
any submanifold of a symplectic manifold on which the induced 2-form has
constant nullity. The characteristics then have dimension equal to that nullity,
and the manifold of characteristics still inherits a symplectic structure.) □

Lemma  A.  Each  characteristic  of  the  manifold  of  lines  tangent  to  a  given 
hypersurface  in  euclidean  space  consists  of  all  the  lines  tangent  to  a  single 
geodesic  on  the  hypersurface. 

Proof  of  Lemma  A.  For  efficiency  of  expression,  we  will  identify  the  cotangent 
vectors  to  euclidean  space  with  tangent  vectors  by  using  the  euclidean  structure, 
so  that  our  original  phase  space  is  represented  as  the  space  of  vectors  based 
at points of euclidean space (i.e., momenta are identified with velocities). The
unit vectors tangent to the given hypersurface form a submanifold of odd
codimension (equal to 3) in phase space. The characteristics of this submanifold
define the
geodesic  flow  on  the  hypersurface. 

The  map  which  assigns  to  each  vector  the  line  in  which  it  lies  takes  the 
codimension  3  submanifold  just  described  to  the  manifold  of  lines  tangent 
to  the  hypersurface.  Under  this  mapping,  characteristics  are  transformed  to 
characteristics  (with  respect  to  the  symplectic  structure  on  the  space  of  lines). 
This  proves  the  lemma.  □ 

[Remark. The preceding argument may be easily extended to the following
general situation, first considered by Melrose. Let Y and Z be a pair of
hypersurfaces in a symplectic manifold X which intersect transversally along a
submanifold W. We consider the manifolds of characteristics B and C of the
hypersurfaces Y and Z together with the canonical quotient fibrations
$Y \twoheadrightarrow B$ and $Z \twoheadrightarrow C$; the manifolds B and C
inherit symplectic structures from X.

In the intersection W, there is a distinguished hypersurface (of codimension
3 in X) consisting of points at which the restriction to W of the symplectic
structure on X is degenerate. This hypersurface $\Sigma$ in W may also be defined
as the set of critical points of the composed mapping
$W \hookrightarrow Y \twoheadrightarrow B$ (or
$W \hookrightarrow Z \twoheadrightarrow C$, if one wishes). These objects form the
following commutative diagram:
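
The objects and maps just listed can be arranged as in the following sketch. The layout is an assumption; only the inclusions and quotient maps named in the text are drawn, together with the composed maps from the distinguished hypersurface of W (written $\Sigma$ here) to B and C.

```latex
\[
\begin{array}{ccccc}
           &          & X        &          &            \\
           & \nearrow &          & \nwarrow &            \\
Y          &          &          &          & Z          \\
\downarrow & \nwarrow &          & \nearrow & \downarrow \\
B          &          & W        &          & C          \\
           & \nwarrow & \uparrow & \nearrow &            \\
           &          & \Sigma   &          &
\end{array}
\]
```

Here the arrows into X, out of W and out of $\Sigma$ into W are inclusions, the vertical arrows out of Y and Z are the quotient fibrations, and the diagonal arrows from $\Sigma$ to B and C are the compositions through W.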


The analogue to Lemma A in this situation is the assertion that the
characteristics on the images of the mappings $\Sigma \to B$ and $\Sigma \to C$ are
the images of one and the same curve on $\Sigma$ (namely, the characteristics of
$\Sigma$ considered as a submanifold of the symplectic manifold X).

Lemma A itself is the special case of the assertion above in which
$X = \mathbb{R}^{2n}$ (the phase space of a free particle in $\mathbb{R}^n$), the
hypersurface Y consists of the unit vectors (given by the condition $p^2 = 1$,
i.e., a level surface of the hamiltonian for a free particle), and the
hypersurface Z consists of those vectors which are based at the points of the
given hypersurface in $\mathbb{R}^n$. In this case, B is the manifold of all
oriented lines in euclidean space, and $\Sigma$ is the manifold of unit vectors
tangent to the hypersurface. The mapping $\Sigma \to B$ assigns to each unit
vector the line which contains it. The manifold C is the (co)tangent bundle of
the given hypersurface. $\Sigma \to C$ is the embedding into this bundle of its
unit sphere bundle (in other words, the embedding of a level surface of the
kinetic energy, i.e., the hamiltonian for motion constrained to the hypersurface).

It  is  always  useful  to  keep  the  diagram  above  in  mind  when  one  is  dealing 
with  constraints  in  symplectic  geometry.] 

Proof of Theorem 3 (Middle). We suppose given a smooth function on
euclidean (configuration) space whose restriction to a certain line has a
nondegenerate critical point. In this situation, the function will also have a
critical point when restricted to each nearby line; i.e., on each nearby line,
there will be a nearby point where the line is tangent to a level surface of the
function. The value of the function at the critical point is thus a function
(defined locally) on the space of lines. We call this function of lines the
induced line function (from the original point function). □

Lemma B. If two point functions in euclidean space are such that the tangent
planes to their level surfaces are orthogonal at the points where a given line
is tangent to these surfaces (these points being in general different for the
two functions), then the Poisson bracket of the induced line functions is zero
at the given line (considered as a point in the space of lines).

Proof  of  Lemma  B.  We  calculate  the  derivative  of  the  second  induced  line 
function  along  the  phase  flow  whose  hamiltonian  is  the  first  induced  function. 
The  phase  curves  for  the  first  induced  function,  which  lie  on  its  level  surfaces, 
are  the  characteristics  of  those  surfaces.  A  level  surface  for  the  first  induced 
function  consists  of  those  lines  which  are  tangent  to  a  single  level  surface  of 
the  first  point  function.  Each  characteristic  of  this  surface,  according  to 
Lemma  A,  consists  of  the  lines  which  are  tangent  to  a  single  geodesic  on  the 
level  surface  of  the  first  point  function. 

For  an  infinitesimally  small  displacement  of  a  point  on  a  geodesic  in 
a  surface,  the  tangent  line  to  the  geodesic  rotates  (up  to  infinitesimal  quantities 
of  higher  order)  in  the  plane  spanned  by  the  original  tangent  and  the  normal 
to  the  surface.  By  hypothesis,  the  tangent  plane  to  the  level  surface  of  the 
second  function  at  the  point  where  this  surface  is  tangent  to  our  line  is 
perpendicular  to  the  tangent  plane  of  the  level  surface  of  the  first  function. 
Therefore,  under  the  above-mentioned  infinitesimally  small  rotation,  the  line 
remains  tangent  to  the  same  level  surface  of  the  second  function  (up  to 
infinitesimals  of  higher  order).  It  follows  that  the  rate  of  change  of  the  second 
induced function under the action of the phase flow given by the first is zero
at  the  element  in  question  of  the  space  of  lines,  which  proves  Lemma  B.  □ 

Proof of Theorem 3 (End). We fix a line in general position in $\mathbb{R}^n$.
According to Theorem 2, this line is tangent to n − 1 quadrics in the confocal
family, at n − 1 points. We construct in the neighborhood of each of these points a
smooth  function,  without  critical  points,  whose  level  surfaces  are  the  quadrics 
of  our  confocal  family. 

We  fix  one  of  these  quadrics  (the  “first”)  and  consider  the  hamiltonian 
system  on  the  space  of  lines  whose  hamiltonian  function  is  the  first  induced 
line function. Each of its phase curves on a fixed level surface of the
hamiltonian function consists of the tangent lines to one geodesic of that quadric
(Lemma  A).  The  remaining  induced  functions  have  zero  Poisson  bracket  with 
the  hamiltonian,  by  Lemma  B  (since  the  planes  tangent  to  the  confocal 
surfaces  at  the  points  where  they  touch  one  line  are  orthogonal,  by  Theorem  2). 

Thus  all  the  induced  functions  are  first  integrals  for  the  hamiltonian  system 
generated  by  any  one  of  them.  Since  the  lines  tangent  to  a  geodesic  on  the  first 
quadric  form  a  phase  curve  of  the  first  system,  all  the  induced  functions  take 
constant values on this curve. That proves Theorem 3, as well as the following
result. □

Theorem  4.  The  geodesic  flow  on  a  central  surface  of  degree  2  in  euclidean  space 
is  a  completely  integrable  system  in  the  sense  of  Liouville  (i.e.,  it  has  as  many 
independent  integrals  in  involution  as  it  has  degrees  of  freedom). 

Remark.  Strictly  speaking,  we  proved  Theorem  3  only  for  lines  in  general 
position,  but  the  result  extends  by  continuity  to  the  exceptional  cases  (in 
particular,  to  asymptotic  lines  of  our  quadrics).  In  the  same  way,  Theorem  4 
was  initially  proved  just  for  quadrics  with  unequal  principal  axes,  but  passage 
to  a  limit  extends  the  result  to  more  symmetric  quadrics  of  revolution  (as  well 
as  to  noncentral  “paraboloids”). 
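
The tangency statement of Theorem 2 used in the proof above can be checked numerically. The following Python sketch is only an illustration under assumptions that are not taken from the text: the confocal family is written in the standard form $\sum_i x_i^2/(a_i - \lambda) = 1$, the line is $p + tv$, and the function name tangent_quadrics is hypothetical. It relies on the observation that the tangency condition is equivalent to $\lambda$ being an eigenvalue of the symmetric matrix $\mathrm{diag}(a) - pp^T$ restricted to the hyperplane orthogonal to $v$, so that the $n - 1$ parameters are real.

```python
import numpy as np

def tangent_quadrics(a, p, v):
    """Parameters lambda and points of contact of the line p + t v with the
    confocal quadrics sum_i x_i^2 / (a_i - lambda) = 1 (assumed normal form)."""
    a, p, v = (np.asarray(z, dtype=float) for z in (a, p, v))
    # Orthonormal basis of the hyperplane orthogonal to the direction v.
    q, _ = np.linalg.qr(v.reshape(-1, 1), mode="complete")
    basis = q[:, 1:]
    # Tangency <=> lambda is an eigenvalue of the restricted symmetric form.
    restricted = basis.T @ (np.diag(a) - np.outer(p, p)) @ basis
    lams = np.linalg.eigvalsh(restricted)
    points = []
    for lam in lams:
        w = 1.0 / (a - lam)
        # Double root of the quadratic in t obtained by substituting p + t v.
        t = -np.sum(w * p * v) / np.sum(w * v * v)
        points.append(p + t * v)
    return lams, np.array(points)

a = np.array([3.0, 2.0, 1.0])
v = np.array([1.0, -1.0, 0.5])
lams, pts = tangent_quadrics(a, [1.0, 1.0, 1.0], v)
# Normals to the two quadrics at their points of contact with the line.
normals = [x / (a - lam) for lam, x in zip(lams, pts)]
print(lams, np.dot(normals[0], normals[1]))
```

For a line in general position in $\mathbb{R}^3$ the sketch returns two parameters, and the scalar product of the two normals vanishes up to rounding, in agreement with the orthogonality of the tangent planes asserted by Theorem 2.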

B  Magnetic  analogues  of  the  theorems  of  Newton  and  Ivory 

Elliptic  coordinates  make  it  possible  to  extend  Newton’s  well-known  theorem 
on  the  gravitational  attraction  of  a  sphere  to  the  case  of  attraction  by  an 
ellipsoid. 

Definition.  A  homeoidal  density  on  the  surface  of  an  ellipsoid  E  is  the  density 
of  a  layer  between  E  and  an  infinitely  nearby  ellipsoid  which  is  homothetic 
to  E  (with  the  same  center). 

The  following  is  a  well-known  result. 

Ivory’s  Theorem.  A  finite  mass,  distributed  on  the  surface  of  an  ellipsoid  with 
homeoidal  density,  does  not  attract  any  internal  point;  it  attracts  every 
external point the same way as if the mass were distributed with homeoidal
density on the surface of a smaller confocal ellipsoid.

The attraction in Ivory’s theorem is defined by the law of Newton or
Coulomb: in n-dimensional space, the force is proportional to $r^{1-n}$ (as prescribed
by the fundamental solution of Laplace’s equation).
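
A plane version of these statements is easy to test numerically. The sketch below assumes n = 2 (so that the force is proportional to $1/r$), an ellipse parametrized as $(a\cos\theta, b\sin\theta)$, and unit total mass; the function name homeoid_field is hypothetical. In this parametrization the homeoidal density is uniform in $\theta$, because the thickness of the layer between homothetic ellipses multiplied by the element of arc length is a constant multiple of $ab\,d\theta$.

```python
import numpy as np

def homeoid_field(a, b, point, samples=4000):
    """Force exerted at `point` by a unit mass spread with homeoidal density
    on the ellipse (a cos t, b sin t), for a force law proportional to 1/r."""
    theta = 2.0 * np.pi * np.arange(samples) / samples
    sources = np.column_stack([a * np.cos(theta), b * np.sin(theta)])
    r = sources - np.asarray(point, dtype=float)
    # Each mass element pulls along r with magnitude 1/|r|; equal weights in
    # theta realize the homeoidal density.
    return (r / np.sum(r * r, axis=1, keepdims=True)).mean(axis=0)

inside = homeoid_field(2.0, 1.0, (0.5, 0.3))
# Two confocal ellipses (a^2 - b^2 = 3 for both) seen from an external point.
outer_small = homeoid_field(2.0, 1.0, (3.0, 2.0))
outer_large = homeoid_field(np.sqrt(5.25), 1.5, (3.0, 2.0))
print(inside, outer_small, outer_large)
```

The first vector vanishes up to rounding (no attraction of an internal point), while the last two coincide (equal masses on confocal ellipses attract an external point in the same way).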

Newton’s  theorem  on  the  (non)attraction  of  an  internal  point  carries  over 
to  the  case  of  a  hyperbolic  homeoidal  layer  and  to  the  case  of  an  attracting 
mass  distributed  on  a  level  hypersurface  of  a  hyperbolic  polynomial  of  any 
degree. (A polynomial of degree m, $f(x_1, \ldots, x_n)$, is called hyperbolic if its
restriction  to  any  line  through  the  origin  has  all  its  roots  real.) 

A homeoidal charge density on the zero hypersurface $f = 0$ of a hyperbolic
polynomial is defined as the density of a homogeneous infinitesimally thin
layer between the hypersurfaces $f = 0$ and $f = \varepsilon \to 0$ (the signs of the charges
being chosen so that successive ovaloids have opposite charges).

[A  homeoidal  charge  does  not  attract  the  origin  (nor  any  other  point  within 
the innermost ovaloid), and this property is preserved if the charge density is
multiplied by any polynomial of degree at most $m - 2$.

Generalization:  If  a  homeoidal  charge  density  is  multiplied  by  any  polynomial 
of  degree  m  -  2  +  r,  then  the  potential  inside  the  innermost  ovaloid  is  a  harmonic 
polynomial  of  degree  r  (A.  B.  Givental’,  1983).] 

When  one  attempts  to  find  a  version  for  hyperboloids  of  Ivory’s  theorem 
on  the  attraction  of  confocal  ellipsoids,  it  turns  out  that  an  essential  role  is 
played  by  the  topology  of  the  hyperboloids.  When  passing  to  hyperboloids 
of  different  signatures,  one  must  consider,  instead  of  homeoidal  densities, 
harmonic  forms  of  different  degrees,  and  instead  of  the  Newton  or  Coulomb 
potential,  the  corresponding  generalized  forms-potentials  given  by  the  Biot- 
Savart  law. 

In  the  simplest  nontrivial  case  of  a  hyperboloid  of  one  sheet  in  three- 
dimensional  euclidean  space,  the  result  is  as  follows. 

The hyperboloid divides space into two parts: “internal” and “external,”
the  latter  being  nonsimply  connected.  We  consider  elliptic  coordinate  curves 
from  the  system  whose  level  surfaces  are  the  quadrics  confocal  to  the  given 
hyperboloid. 

The  elliptic  coordinate  curves  on  our  hyperboloid,  which  are  obtained 
by  intersecting  with  the  confocal  ellipsoids  (closed  lines  of  curvature  on 
the  hyperboloid),  are  called  the  parallels  of  the  hyperboloid.  The  orthogonal 
curves,  obtained  by  intersection  with  the  two-sheeted  hyperboloids,  are  called 
the  meridians. 

Although  the  elliptic  coordinate  system  has  singularities  (on  each  symmetry 
plane  of  the  quadrics  in  the  family),  the  hyperboloid  is  smoothly  fibred  by 
the  parallels  (diffeomorphic  to  the  circle)  and  meridians  (diffeomorphic  to 
the  line). 

The region inside the hyperboloidal tube is also smoothly fibred by meridians
(orthogonal to the ellipsoids in the confocal family), while the annular
region outside the hyperboloid is smoothly fibred by parallels (orthogonal to
the hyperboloids of two sheets).

Figure 249 Magnetic fields generalizing the theorems of Newton and Ivory

Theorem.  A  current  with  a  suitable  density,  flowing  along  the  meridians  of  a 
hyperboloid,  produces  a  magnetic  field  which  is  zero  inside  the  hyperboloidal 
tube,  while  the  field  in  the  annular  exterior  region  is  directed  along  the 
parallels.  A  current  with  a  suitable  density,  flowing  along  the  parallels  of  a 
hyperboloid,  produces  a  magnetic  field  which  is  zero  in  the  exterior  annular 
region,  while  the  field  inside  the  hyperboloidal  tube  is  directed  along  the 
meridians. (See Figure 249.)

The  current  densities  giving  rise  to  such  magnetic  fields,  which  generalize 
the  homeoidal  charge  densities  on  ellipsoids,  may  be  described  in  the  following 
way.  There  are  associated  to  each  family  of  confocal  quadrics  in  three- 
dimensional  euclidean  space  two  “focal  curves”:  an  ellipse  and  a  hyperbola. 
(See  Figure  250.)  The  focal  ellipse  is  the  boundary  of  the  limiting  ellipsoid  of 
the  family  in  which  the  shortest  axis  shrinks  to  zero;  the  focal  hyperbola  arises 
in  a  similar  way  from  the  hyperboloids  of  one  or  two  sheets. 


Figure  250  Focal  ellipse  and  focal  hyperbola 

We define a homeoidal density on a focal ellipse in the following way. To
begin we consider any nonplanar parallel, defined as the nonplanar intersection
of an ellipsoid with a hyperboloid of one sheet. A homeoidal density
on this parallel is defined as the density on an infinitesimally thin “wire,”
obtained by intersecting the layer between the given ellipsoid and a homothetic
one infinitesimally nearby with the layer between the given hyperboloid and
a homothetic one infinitesimally close by, both homotheties being taken with
respect to the center of the confocal family. We normalize this homeoidal
density on the parallel in such a way that the mass of the entire parallel is equal
to 1.

Now  we  consider  the  focal  ellipse  as  a  limit  of  nonplanar  parallels.  It  turns 
out  that  the  normalized  homeoidal  densities  on  the  parallels  have  a  well- 
defined  limit  as  the  parallels  approach  the  focal  ellipse.  This  limiting  density 
is  called  the  homeoidal  density  on  the  focal  ellipse. 

The  homeoidal  density  on  a  focal  hyperbola  is  defined  in  an  analogous  way. 

We  may  now  describe  the  current  densities  referred  to  as  “suitable”  in  the 
theorem  above  on  magnetic  fields.  The  surface  of  a  hyperboloid  of  one  sheet 
is  fibred  over  the  focal  ellipse  (the  fibre  over  a  point  is  the  meridian  which  lies 
on  the  same  hyperboloid  of  two  sheets  as  that  point). 

The  flux  of  the  meridional  current  suitable  for  the  theorem,  through  any  curve 
on  the  hyperboloid,  equals  the  integral  of  the  homeoidal  density  form  on  the 
focal ellipse over the projection of that curve onto the focal ellipse (along the
hyperboloids  of  two  sheets). 

The  density  of  the  flow  along  the  parallels  is  induced  in  an  analogous  way 
from  the  homeoidal  density  on  the  focal  hyperbola. 

Remark.  The  magnetic  field  of  the  parallel  flow  with  the  indicated  density, 
inside the hyperboloidal tube, coincides outside each confocal ellipsoid (up to
sign)  with  the  newtonian  or  coulombian  field  produced  by  a  charge  which  is 
distributed  with  homeoidal  density  on  that  ellipsoid.124 

In  exactly  the  same  way,  the  magnetic  field  in  the  annular  domain  outside 
the  hyperboloid  of  one  sheet  coincides  (up  to  sign),  in  the  region  between  the 
sheets  of  each  confocal  hyperboloid  of  two  sheets,  with  the  coulombian  field 
produced  by  two  equal  charges  with  opposite  signs  distributed  on  the  two 
sheets  of  the  hyperboloid  with  homeoidal  density  (O.  P.  Shcherbak). 

The  results  formulated  above  have  recently  been  extended  by  B.  Z.  Shapiro 
and  A.  D.  Vainshtein  to  hyperboloids  in  euclidean  spaces  of  any  number  of 
dimensions. For a hyperboloid in $\mathbb{R}^n$, diffeomorphic to $S^k \times \mathbb{R}^l$, a harmonic
k-form is constructed on the exterior region (diffeomorphic to the product of
$S^k$ with a half-space) and a harmonic l-form is constructed on the interior.

The corresponding homeoidal densities are defined on the focal ellipsoid
with codimension k and the focal hyperboloid of two sheets with codimension
l by the same limiting procedure that we described above for $k = l = 1$,
using the intersections of layers between infinitesimally close and homothetic
quadrics.

124 This is actually the density with which a charge will distribute itself on the surface of a
conducting ellipsoid.

Noncomputational  proofs  of  these  geometric  theorems  are  unknown,  even 
for  the  special  case  of  magnetic  fields  in  three-dimensional  space. 

Remark.  The  presence  of  distinguished  harmonic  forms  on  hyperboloids 
and  in  their  complementary  domains  suggests  that  one  might  try  to  find 
filtrations,  analogous  to  those  arising  in  the  theory  of  mixed  Hodge  structures, 
in  spaces  of  differential  forms  on  noncompact  (and  possibly  even  singular) 
algebraic  and  semialgebraic  real  manifolds. 
The simplest example of a ray system is the system of normals to a surface in
euclidean  space. 

In  a  neighborhood  of  a  smooth  surface,  its  normals  form  a  smooth  fibration, 
but  at  some  distance  from  the  surface  various  normals  begin  to  intersect  one 
another  (Figure  251).  The  complicated  figures  which  are  thereby  formed  were 
already  investigated  by  Archimedes,  but  their  full  details  were  not  revealed 
until  the  discovery  in  1972  of  the  relation  between  singularities  of  ray  systems 
and  the  theory  of  groups  generated  by  reflections. 

This  relation,  for  which  there  is  no  evident  a  priori  reason  (and  which  is  as 
surprising  as,  say,  the  relation  between  the  problems  of  tangents  and  areas), 
has  turned  out  to  be  a  powerful  instrument  for  the  study  of  critical  points  of 
functions.  By  1978,  it  had  become  clear  that  the  theory  of  reflection  groups 
also  governs  the  singularities  of  the  Huygens  evolvents. 

Huygens  (1654)  discovered  that  the  evolvent  of  a  plane  curve  has  a  cusp 
singularity at each point where it meets the curve (Figure 252). Evolvents of
plane  curves  and  their  higher-dimensional  generalizations  are  wave  fronts 
on  manifolds  with  boundary.  Singularities  of  wave  fronts,  like  those  of  ray 
systems,  are  classified  in  terms  of  reflection  groups. 

While  rays  and  fronts  on  manifolds  without  boundary  are  related  to  the 
Weyl  groups  in  the  A,  D,  and  E  series,  singularities  of  evolvents  are  described 
by the groups of types B, C, and F (the ones with double connections in their
Dynkin  diagrams). 

The  remaining  reflection  groups  (I2(p),  H3,  H4)  continued  for  some  time  to 
have  no  visible  relation  to  the  theory  of  singularities.  This  situation  changed 
in  the  fall  of  1982  when  it  was  discovered  that  the  symmetry  group  H3  of  the 
icosahedron  governs  the  singularities  of  evolvent  systems  in  the  neighborhood 
of  inflection  points  of  plane  curves. 

The  appearance  of  the  icosahedron  at  an  inflection  point  of  a  curve  looks 
as  mystical  as  the  icosahedron  in  Kepler’s  law  of  planetary  distances.  But  the 
presence  of  the  icosahedron  here  is  not  an  accident:  upon  the  investigation  in 
1984 of more complicated systems of rays and fronts, the remaining group H4
appeared. 

We  shall  give  in  this  appendix  a  brief  description  of  the  theory  of  singularities 
of  ray  systems.  Further  details  may  be  found  in  the  following  references: 

V.  I.  Arnold,  Singularities  of  ray  systems,  Russian  Math.  Surveys  38  (1983). 

V.  I.  Arnold,  Singularities  in  variational  calculus,  J.  Soviet  Math.  27 
(1984),  2679-2713. 

O.  V.  Lyashko,  Classification  of  critical  points  of  functions  on  a  manifold 
with  singular  boundary,  Funct.  Anal.  Appl.  17  (1983),  187-193. 

O.  P.  Shcherbak,  Singularities  of  families  of  evolvents  in  the  neighborhood 
of an inflection point of the curve, and the group H3, generated by reflections,
Funct.  Anal.  Appl.  17  (1983),  301-303. 

A.  N.  Varchenko  and  S.  V.  Chmutov,  Finite  irreducible  groups,  generated 
by reflections, are monodromy groups of suitable singularities, Funct. Anal.
Appl.  18  (1984),  171-183. 

Figure 251 A caustic as the envelope of rays


Figure  252  An  evolvent  of  a  curve 


V.  I.  Arnold,  Singularities  of  solutions  of  variational  problems  (Seminar 
report,  in  Russian),  Uspekhi  Mat.  Nauk  39,  no.  5  (1984),  256. 

O.  P.  Shcherbak,  Wave  fronts  and  reflection  groups.  Russian  Math.  Surveys, 
43,  no.  3  (1988). 

Itogi  Nauki  i  Techniki,  Sovremennye  Problemy  matematiki,  Noveishie 
dostijenia, Moscow, VINITI, vol. 33 (1988). English translation: J. Sov. Math.
27  (1984). 

Many  of  the  results  which  we  will  describe  concern  such  simple  geometric 
objects  that  it  is  surprising  that  they  were  not  already  known  in  classical  times. 
For  instance,  the  local  classification  of  projections  of  generic  surfaces  in 
three-dimensional space was not discovered until 1981. The number of equivalence
classes of germs of projections turned out to be finite, namely 14:
neighborhoods  of  points  on  generic  surfaces  can  have  that  many  different 
appearances  when  viewed  from  different  points  in  space. 

A Symplectic manifolds and ray systems

1.  The  space  of  oriented  lines  in  euclidean  space  may  be  identified  with 
the  (co)tangent  bundle  of  the  sphere  (Figure  253),  and  it  thereby  obtains  a 
symplectic  structure. 

Figure 253 The space of oriented lines in euclidean space

2. More generally, we consider any hypersurface in a symplectic manifold.
The skew-orthogonal complement to its tangent space at each point is called
the characteristic direction. The integral curves of the field of characteristic
directions on a hypersurface are called characteristics. The manifold of
characteristics inherits a symplectic structure from the original manifold.

3.  In  particular,  the  manifold  of  extremals  of  a  general  variational  problem 
carries  a  symplectic  structure. 

4.  We  consider  the  space  of  binary  forms  (homogeneous  polynomials  in 
two  variables)  of  a  particular  odd  degree.  The  group  of  linear  transformations 
of  the  plane  acts  on  this  even  dimensional  linear  space.  Up  to  multiplication 
by  a  constant,  there  is  a  unique  nondegenerate  skew-symmetric  form  on  this 
space which is invariant under the action of the group SL(2) of linear
transformations with determinant equal to 1. This form gives a natural symplectic
structure  on  the  manifold  of  binary  forms  of  each  odd  degree. 

5. The binary forms in x and y for which the coefficient of $x^{2k+1}$ is unity
form a hypersurface in the space of all forms. The manifold of characteristics of
this hypersurface is naturally identified with the manifold of monic polynomials
of even degree $x^{2k} + \cdots$ in x. We have thereby defined a natural symplectic
structure  on  this  space  of  polynomials. 

6.  The  one-parameter  group  of  translations  along  the  x-axis  preserves  the 
symplectic  structure  just  introduced.  The  hamiltonian  function  for  this  group 
is  a  quadratic  polynomial  (found  already  by  Hilbert  (1893)).  The  manifold 
of  characteristics  for  any  level  surface  of  this  hamiltonian  function  may  be 
identified with the manifold of monic polynomials of degree $2k - 1$ in x for which
the  sum  of  the  roots  is  zero.  Thus  we  have  a  natural  symplectic  structure  on 
this  space  of  polynomials. 
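
The invariant skew-symmetric form of item 4 can be written out and tested explicitly. The sketch below assumes a particular normalization which is not taken from the text: a binary form $\sum_i c_i x^{n-i} y^i$ is stored as the list of its coefficients $c_0, \ldots, c_n$, and the pairing is $\sum_i (-1)^i c_i d_{n-i} / \binom{n}{i}$ (the classical apolar pairing). The function names skew_form and substitute are hypothetical.

```python
from math import comb
import numpy as np

def skew_form(c, d):
    """Pairing sum_i (-1)^i c_i d_(n-i) / C(n, i) of two binary forms of
    degree n, given by the coefficients of x^(n-i) y^i."""
    n = len(c) - 1
    return sum((-1) ** i * c[i] * d[n - i] / comb(n, i) for i in range(n + 1))

def substitute(c, g):
    """Coefficients of f(al*x + be*y, ga*x + de*y) for g = ((al, be), (ga, de))."""
    (al, be), (ga, de) = g
    n = len(c) - 1
    out = np.zeros(n + 1)
    for i, ci in enumerate(c):
        # Expand (al*x + be*y)^(n-i) * (ga*x + de*y)^i by repeated convolution;
        # entry j of `term` is the coefficient of x^(n-j) y^j.
        term = np.array([1.0])
        for _ in range(n - i):
            term = np.convolve(term, [al, be])
        for _ in range(i):
            term = np.convolve(term, [ga, de])
        out += ci * term
    return out

f = [1.0, -2.0, 0.5, 3.0]       # a binary cubic
h = [0.3, 1.0, -1.0, 2.0]       # another binary cubic
g = ((2.0, 1.0), (3.0, 2.0))    # determinant 2*2 - 1*3 = 1
before = skew_form(f, h)
after = skew_form(substitute(f, g), substitute(h, g))
print(before, after)
```

For odd degree the pairing changes sign when its arguments are exchanged, and the two printed values agree, which illustrates the SL(2)-invariance; the uniqueness claimed in item 4 is not tested here.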

B  Submanifolds  of  symplectic  manifolds 

The  restriction  of  a  symplectic  structure  to  a  submanifold  is  a  closed  2-form, 
but  it  is  not  necessarily  nondegenerate.  For  submanifolds  in  euclidean  space 
there  is,  in  addition  to  the  intrinsic  geometry,  an  extensive  theory  of  extrinsic 
curvatures.  In  symplectic  geometry,  the  situation  is  simpler: 

Theorem (A. B. Givental’, 1981). The restriction of the symplectic form to a
germ  of  a  submanifold  in  a  symplectic  manifold  determines  the  germ  up  to  a 
symplectic  diffeomorphism  of  the  ambient  manifold. 

An  intermediate  theorem,  in  which  one  uses  the  values  of  the  symplectic 
form  at  all  vectors  based  on  the  submanifold,  not  just  those  tangent  to  it,  was 
proved  earlier  by  A.  Weinstein  (1971).  Unlike  Weinstein’s  theorem,  Givental’s 
theorem  makes  it  possible  to  classify  generic  submanifold  germs  in  symplectic 
manifolds:  it  is  sufficient  to  use  the  classification  of  degenerate  symplectic 
structures  obtained  by  J.  Martinet  (1970)  and  his  successors. 

Examples. 1. A generic two-dimensional surface in symplectic space is symplectically
diffeomorphic in a neighborhood of each point with the surface
$p_2 = p_1^2$, $p_3 = q_3 = \cdots = 0$ (in Darboux coordinates). 2. On four-dimensional
submanifolds, one finds stable curves of elliptic and hyperbolic Martinet
singular points with normal forms

$p_2 = p_1 p_3 \pm q_1 q_2 + p_3^3/6$, $q_3 = 0$, $p_4 = q_4 = \cdots = 0$.

[The  ellipticity  or  hyperbolicity  of  a  singular  point  is  determined  by  the 
nature  of  the  dynamical  system  invariantly  attached  to  the  submanifold.  The 
divergence-free  vector  fields  in  three-dimensional  space  which  arise  have 
entire  curves  of  singular  points.  The  classification  of  singular  lines  turns  out 
to  be  less  pathological  than  the  classification  of  singular  points  (which  is 
almost  as  difficult  as  all  of  celestial  mechanics).] 

This  concludes  a  description  of  the  first  steps  in  the  theory  of  symplectic 
singularities  on  smooth  manifolds. 

C  Lagrangian  submanifolds  in  the  theory  of  ray  systems 

We  recall  that  a  lagrangian  submanifold  is  a  submanifold  of  symplectic  space 
on  which  the  symplectic  structure  pulls  back  to  zero  and  which  has  the  highest 
possible  dimension  consistent  with  this  property  (equal  to  half  the  dimension 
of  the  ambient  manifold). 

Examples.  1.  Each  fibre  of  a  cotangent  bundle  is  lagrangian.  2.  The  manifold  of 
all  oriented  normals  to  a  smooth  submanifold  (of  any  dimension)  in  euclidean 
space  is  a  lagrangian  submanifold  of  the  space  of  lines.  3.  The  manifold  of  all 
polynomials $x^{2m} + \cdots$ divisible by $x^m$ is lagrangian.

A  lagrangian  fibration  is  a  fibration  all  of  whose  fibres  are  lagrangian. 

Examples.  1.  The  cotangent  fibration  is  lagrangian.  2.  The  Gauss  fibration 
from  the  space  of  lines  in  euclidean  space  to  the  unit  sphere  of  directions  is 
lagrangian. 

All lagrangian fibrations of a fixed dimension are locally (on a neighborhood
of a point in the total space) symplectically diffeomorphic.

A  lagrangian  mapping  is  the  projection  of  a  lagrangian  submanifold  to  the 
base of a lagrangian fibration, i.e., a triple $V \to E \to B$, where the first arrow
is  an  immersion  onto  a  lagrangian  manifold  and  the  second  arrow  is  a 
lagrangian  fibration. 

Examples. 1. A gradient mapping $q \mapsto \partial S/\partial q$ is lagrangian. 2. The normal
mapping  which  maps  each  normal  vector  of  a  submanifold  in  euclidean  space 
to  its  tip  is  lagrangian.  3.  The  Gauss  mapping  which  takes  each  point  of  a 
transversely  oriented  hypersurface  in  euclidean  space  to  the  unit  vector  at 
the origin in the direction of the normal is lagrangian. (The corresponding
lagrangian  manifold  consists  of  the  normals  themselves.) 

An  equivalence  of  lagrangian  mappings  is  a  fibre-preserving  symplectic 
diffeomorphism  of  the  total  spaces  of  the  fibrations  which  takes  the  first 
lagrangian  manifold  to  the  second. 

The  set  of  critical  values  of  a  lagrangian  mapping  is  called  a  caustic.  The 
caustics  of  equivalent  mappings  are  diffeomorphic. 

Example.  The  caustic  of  the  normal  mapping  of  a  surface  is  the  envelope  of 
the  family  of  normals,  i.e.,  the  focal  surface  (surface  of  centers  of  curvature). 
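
A plane analogue of this example can be computed directly. The sketch below is an illustration only: it replaces the surface by an ellipse $(a\cos\theta, b\sin\theta)$ in the plane, uses the standard formula for the center of curvature of a parametrized plane curve, and the function name centre_of_curvature is hypothetical. The caustic of the normal mapping is then the evolute, which for the ellipse is the curve $(aX)^{2/3} + (bY)^{2/3} = (a^2 - b^2)^{2/3}$ with four cusps.

```python
import numpy as np

def centre_of_curvature(a, b, theta):
    """Centre of curvature of the ellipse (a cos t, b sin t) at t = theta,
    i.e. the point where infinitely close normals intersect."""
    x, y = a * np.cos(theta), b * np.sin(theta)
    dx, dy = -a * np.sin(theta), b * np.cos(theta)      # first derivatives
    ddx, ddy = -a * np.cos(theta), -b * np.sin(theta)   # second derivatives
    scale = (dx**2 + dy**2) / (dx * ddy - dy * ddx)
    return x - dy * scale, y + dx * scale

a, b = 2.0, 1.0
theta = np.linspace(0.1, 6.0, 50)
X, Y = centre_of_curvature(a, b, theta)
# Left-hand side of the equation of the evolute of the ellipse.
lhs = np.abs(a * X) ** (2.0 / 3.0) + np.abs(b * Y) ** (2.0 / 3.0)
print(lhs.min(), lhs.max(), (a**2 - b**2) ** (2.0 / 3.0))
```

The three printed numbers coincide up to rounding, so the computed centers of curvature lie on the expected caustic.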

Every  lagrangian  mapping  is  locally  equivalent  to  a  gradient  (or  normal, 
or  Gauss)  mapping.  The  singularities  of  generic  gradient  (or  normal,  or  Gauss) 
mappings  are  the  same  as  those  for  arbitrary  generic  lagrangian  mappings. 
The simplest of these are classified by the reflection groups $A_k$, $D_k$, $E_6$, $E_7$, $E_8$
(see  Appendix  12). 

Example.  We  consider  a  medium  of  dust  particles  moving  inertially,  with 
their  initial  velocities  forming  a  potential  field.  After  time  t,  the  particle  at  x 
moves to $x + t\,(\partial S/\partial x)$. We thereby obtain a one-parameter family of smooth
mappings $\mathbb{R}^3 \to \mathbb{R}^3$.

These mappings are lagrangian. In fact, a potential field of velocities gives
a lagrangian section of the cotangent bundle. The phase flow of Newton’s
equations preserves the lagrangian property. For large t, though, our lagrangian
manifold is no longer a section: its projection on the base develops singularities.
The caustics of the corresponding lagrangian mappings are places where
the density of particles has become infinite.125 According to Ya. B. Zel’dovich
(1970) an analogous model (taking into account gravity and the expansion of
the universe) describes the formation of large scale nonhomogeneities in the
distribution of matter in the universe.

125 The relation between caustics and dust-like media was first discovered by Lifshitz, Sudakov,
and Khalatnikov: see the survey by E. M. Lifshitz and I. M. Khalatnikov, Investigations in
relativistic cosmology, Adv. Phys. 12 (1963), 185.
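
The formation of a caustic in a dust-like medium is easy to follow in one dimension. The sketch below is only an illustration under assumptions that are not in the text: the space is the line, the velocity potential is $S(x) = \cos x$, the initial density is uniform, and the function names are hypothetical. The mapping $x \mapsto x + tS'(x) = x - t\sin x$ has Jacobian $1 - t\cos x$, so the first caustic is born at $t = 1$ at the point $x = 0$.

```python
import math

def density(t, x):
    """Density at time t of the particles that started near x, relative to
    the uniform initial density (infinite where the Jacobian vanishes)."""
    return 1.0 / abs(1.0 - t * math.cos(x))

def caustic_edges(t):
    """Positions of the two caustic points near the origin at time t,
    or None before the caustic is born (t < 1)."""
    if t < 1.0:
        return None
    x0 = math.acos(1.0 / t)          # fold point: 1 - t cos(x0) = 0
    y0 = x0 - t * math.sin(x0)       # its image (not positive for t >= 1)
    return (y0, -y0)

def width(tau):
    """Distance between the caustic points at time tau after the birth."""
    left, right = caustic_edges(1.0 + tau)
    return right - left

# Estimate the growth exponent of the width from two small times.
exponent = math.log(width(1e-3) / width(1e-4)) / math.log(10.0)
print(caustic_edges(2.0), exponent)
```

The estimated exponent is close to 3/2, consistent with the order $t^{3/2}$ quoted for the thickness of a newborn saucer in the description of Lagrange singularities.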

According  to  the  theory  of  Lagrange  singularities,  the  newborn  caustics 
have  the  form  of  elliptic  saucers  (Figure  254)  (after  time  t  from  the  moment  of 
birth, a saucer has length of order $t^{1/2}$, depth of order $t$, and thickness of order
$t^{3/2}$). The birth of a saucer corresponds to $A_3$. The metamorphoses of caustics
which occur in generic one-parameter families of lagrangian mappings are
shown in Figure 255 (V. I. Arnold, Wave fronts evolution and equivariant
Morse lemma, Comm. Pure Appl. Math. 6 (1976), 319-335).

Theorem  (1972).  The  germs  at  each  point  of  generic  lagrangian  mappings 
between manifolds of dimension $\le 5$ are simple (i.e., having no moduli) and
stable. The simple stable germs of lagrangian mappings are classified by the
reflection groups A, D, E, in a way which will be explained below.

D  Contact  geometry  and  systems  of  rays  and  wave  fronts 

We  recall  that  a  contact  structure  on  an  odd  dimensional  smooth  manifold 
is  a  nondegenerate  field  of  tangent  hyperplanes.  The  specific  condition  of 
nondegeneracy is inessential here, since near generic points, all generic
hyperplane fields on manifolds of a fixed odd dimension are diffeomorphic
(Darboux’s  theorem  for  contact  structures,  Appendix  4). 

Examples.  1.  The  manifold  of  contact  elements  of  a  smooth  manifold  consists 
of  all  its  tangent  hyperplanes.  The  rate  of  change  of  a  contact  element  belongs 
to  the  contact  structure  if  and  only  if  the  rate  of  change  of  the  point  of  contact 
(i.e.,  the  point  where  the  hyperplane  is  tangent  to  the  manifold)  belongs  to  the 
contact element itself. 2. The manifold of 1-jets of functions $y = f(x)$ has a
contact structure $dy = p\,dx$ ($p = \partial f/\partial x$ for the 1-jet of a function $f$).

The  extrinsic  geometry  of  a  submanifold  of  contact  space  is  locally  deter¬ 
mined  by  the  intrinsic  geometry  (Givental’s  theorem  on  contact  structures). 

Integral  submanifolds  of  a  contact  structure  are  called  Legendre  (or 
legendrian)  submanifolds  if  they  have  the  largest  possible  dimension. 

Examples.  1.  The  set  of  all  contact  elements  tangent  to  a  fixed  submanifold 
(of  any  dimension)  is  a  Legendre  submanifold.  2.  In  particular,  all  contact 
elements  at  a  given  point  form  a  Legendre  submanifold  (a  fibre  of  the  bundle 
of  contact  elements).  3.  The  set  of  all  the  1-jets  of  a  single  function  is  a  Legendre 
submanifold  in  the  space  of  1-jets. 

A  fibration  is  called  a  Legendre  fibration  if  its  fibres  are  Legendre 
submanifolds. 

Examples. 1. The projective cotangent fibration (attaching each contact element
to its point of contact) is Legendre. 2. The fibration of 1-jets of functions
over  the  0-jets  (forgetting  the  derivative)  is  Legendre. 

All Legendre fibrations of a fixed dimension are locally contact diffeomorphic
(in a neighborhood of a point in the total space of the fibration).

The  projection  of  a  Legendre  submanifold  on  the  base  of  a  Legendre 
fibration  is  called  a  Legendre  mapping.  The  image  of  a  Legendre  mapping  is 
called  its  front. 

Examples. 1. The Legendre transformation: A hypersurface in projective space
may  be  lifted  to  the  space  of  contact  elements  of  projective  space  as  a  Legendre 
submanifold.  The  manifold  of  contact  elements  of  projective  space  is  also 
fibred  over  the  dual  projective  space.  (The  fibration  assigns  to  each  contact 
element  the  plane  containing  it.)  This  is  a  Legendre  fibration.  The  projection 
of  the  lifted  Legendre  submanifold  maps  it  onto  the  hypersurface  which  is 
projectively  dual  to  the  original  one.  Thus,  the  projective  dual  of  a  smooth 
hypersurface  is  the  front  of  a  Legendre  mapping.  2.  Frontal  mappings:  Laying 
out  a  segment  of  length  t  on  each  normal  to  a  hypersurface  in  euclidean  space, 
we  obtain  a  Legendre  mapping  whose  front  is  equidistant  from  the  given 
hypersurface. 

Every Legendre mapping is locally equivalent to a Legendre transformation,
as well as to a frontal mapping. The theory of Legendre singularities thus
coincides exactly with the theory of singularities of Legendre transformations
and of frontal mappings. Equivalence, stability, and simplicity of Legendre
mappings are defined just as in the lagrangian case.

Theorem  (1973).  The  germs,  at  all  points,  of  generic  Legendre  mappings  between 
manifolds of dimension $\le 5$ are simple and stable. The simple and stable
germs of Legendre mappings are classified by the groups A, D, E: their fronts
are locally diffeomorphic (in the complex domain) to the manifolds of nonregular
orbits of the corresponding reflection groups.

Example.  The  only  singularities  of  a  typical  wave  front  in  three-dimensional 
space are (semicubic) cuspidal curves ($A_2$) and “swallowtails” ($A_3$, Figure 256;
near such a point, the front is diffeomorphic to the surface formed by the
polynomials with multiple roots in the space of polynomials $x^4 + ax^2 + bx + c$).
Of course, there may also be transverse intersections of branches of fronts of
the types just described.
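
The swallowtail surface of this example can be parametrized explicitly. The sketch below is an illustration with hypothetical function names: requiring that $x^4 + ax^2 + bx + c$ and its derivative vanish at $x = u$ gives $b = -4u^3 - 2au$ and $c = 3u^4 + au^2$, and requiring in addition that the second derivative vanish ($a = -6u^2$) gives the cuspidal edge.

```python
def swallowtail_point(a, u):
    """Coefficients (a, b, c) for which x^4 + a x^2 + b x + c has a multiple
    root at x = u; (a, u) are parameters on the swallowtail surface."""
    b = -4.0 * u**3 - 2.0 * a * u
    c = 3.0 * u**4 + a * u**2
    return a, b, c

def cuspidal_edge_point(u):
    """Point of the cuspidal edge: the polynomial has a triple root at x = u."""
    return swallowtail_point(-6.0 * u**2, u)

def value_and_derivatives(coeffs, x):
    """Values of the polynomial and of its first two derivatives at x."""
    a, b, c = coeffs
    return (x**4 + a * x**2 + b * x + c,
            4.0 * x**3 + 2.0 * a * x + b,
            12.0 * x**2 + 2.0 * a)

print(value_and_derivatives(swallowtail_point(-1.0, 0.7), 0.7))
print(value_and_derivatives(cuspidal_edge_point(0.5), 0.5))
```

In the first line the value and the first derivative vanish up to rounding; in the second all three quantities vanish. The most singular point of the swallowtail is the origin $a = b = c = 0$, obtained for $u = 0$ on the cuspidal edge.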

Remark.  The  real  forms  of  simple  singularities  of  fronts  may  also  be  described 
in  terms  of  reflection  groups.  E.  Looijenga  has  shown  that  the  real  components 
in  the  complement  of  a  simple  germ  of  a  front  may  be  identified  with  the 
conjugacy  classes  of  involutions  (elements  of  order  2)  in  the  normalizer  of 
the  reflection  group,  conjugacy  being  taken  with  respect  to  the  reflection 
group  itself.  (See  E.  Looijenga,  The  discriminant  of  a  real  simple  singularity, 
Compositio  Math.  37  (1978),  51-62.) 

E  Applications  of  contact  geometry  to  symplectic  geometry 

All  lagrangian  singularities  may  be  obtained  from  Legendre  singularities,  if 
one  realizes  the  latter  by  projections  of  Legendre  submanifolds  of  the  space 
of  1-jets  of  functions  onto  the  space  of  0-jets.  If  one  forgets  the  value  of  each 
function,  the  space  of  1-jets  is  projected  onto  phase  space  (i.e.,  the  cotangent 
bundle);  a  Legendre  submanifold  in  the  first  space  projects  to  a  lagrangian 
submanifold  in  the  second.  In  particular,  the  caustic  of  a  lagrangian  mapping 
is  the  image  of  the  cuspidal  edge  of  the  front  of  a  Legendre  mapping  under  a 
projection  with  one-dimensional  fibres. 

Theorem  (O.  V.  Lyashko,  1979).  All  holomorphic  vector  fields  transverse  to  the 
front of a simple singularity are locally equivalent under holomorphic
diffeomorphisms preserving the front.

Example. A generic vector field in the neighborhood of the most singular
point of a swallowtail {x^4 + ax^2 + bx + c = (x + d)^2 ...} is equivalent, by
a holomorphic diffeomorphism preserving the swallowtail, to the normal form
∂/∂c (Figure 257).

The  reduction  of  various  objects  to  normal  form,  by  a  diffeomorphism 
preserving  a  wave  front  or  caustic,  is  a  basic  technique  for  studying  the 
geometry of systems of rays and fronts. For instance, the study of the
metamorphoses of moving wave fronts is based on the following result, which is
“dual” to the previous one.

Figure 257 The normal form of a vector field at the swallowtail

Theorem  (1976).  All  generic  holomorphic  functions  equal  to  zero  at  the  most 
singular  point  of  a  simple  singularity  of  a  front  are  locally  equivalent  under 
holomorphic  diffeomorphisms  which  preserve  the  front. 

Example. In a neighborhood of the most singular point of a swallowtail, a
generic function may be reduced, by a diffeomorphism preserving the
swallowtail, to the normal form a.

This  theorem  is  a  special  case  of  the  equivariant  Morse  lemma.  It  is  applied 
in  the  following  way.  The  instantaneous  wave  fronts  together  form  a  “large 
front”  in  space-time.  “Time”  is  a  function  on  space-time.  We  reduce  this 
function  to  normal  form  by  a  diffeomorphism  which  preserves  the  front,  and 
we  thereby  obtain  a  normal  form  for  the  metamorphoses  of  the  instantaneous 
fronts. The metamorphoses of fronts in R^3 are shown in Figure 258. The
problem  of  describing  the  metamorphoses  of  caustics  in  generic  one-parameter 
families  (Figure  255)  is  solved  in  exactly  the  same  way.  In  this  case,  the  time 
function  is  reduced  to  normal  form  by  a  transformation  of  space-time  which 
preserves  the  “large  caustic.”  If  the  dimension  of  space-time  is  no  larger  than 
4,  then  all  the  singularities  of  the  large  caustic  are  of  types  A  and  D. 
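
To illustrate how a normal form of the time function yields a metamorphosis, here is a small sketch under stated assumptions: the large front is taken to be the swallowtail in (a, b, c)-space and time is taken to be the coordinate a, as in the example above. The instantaneous fronts are then the plane sections a = const. Cusps of a section lie on the cuspidal edge, i.e. at polynomials with a triple root, (x - t)^3 (x + 3t) = x^4 - 6t^2 x^2 + 8t^3 x - 3t^4. The function name `cusp_points` is hypothetical.

```python
import math


def cusp_points(a0):
    """Cusps (b, c) of the section a = a0 of the swallowtail.

    A cusp corresponds to a triple root t, where a = -6 t^2,
    b = 8 t^3, c = -3 t^4; real solutions exist only for a0 <= 0.
    """
    if a0 > 0:
        return []
    t = math.sqrt(-a0 / 6)
    roots = [t] if t == 0 else [t, -t]
    return [(8 * r**3, -3 * r**4) for r in roots]


# Before the critical moment (a0 < 0) the section has two cusps;
# after it (a0 > 0) the cusps have disappeared.
print(len(cusp_points(-6.0)), len(cusp_points(1.0)))
```

In this sketch the two cusps merge at a0 = 0 in the most singular point of the swallowtail.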

The  caustics  of  lagrangian  singularities  in  the  A  series  differ  from  the  wave 
fronts  in  the  A  series  only  by  a  shift  of  1  unit  in  the  index.  The  same  is  therefore 
true  for  their  metamorphoses. 

The  caustics  in  the  D  series  are  not  the  same  as  the  fronts.  The  normal  forms 
for  a  generic  time  function  in  the  neighborhood  of  a  caustic  singularity  of  type 
D were found by V. M. Zakalyukin (1975). The topological normal forms for
the time function are especially simple:

Caustic          Real case                  Complex case

D_4^-            λ_1 + λ_2                  λ_1 + λ_2
D_4^+            λ_1 + λ_2, λ_1 + λ_4       λ_1 + λ_2
D_{2k+1}         ±λ_1                       λ_1
D_{2k}, k ≥ 3    λ_1 ± λ_2                  λ_1 + λ_2

Here, the large caustic D_μ is the set of λ for which F(·, λ) has a degenerate
critical point, where

F(x, λ) = ±x_1^2 x_2 + ⋯ + λ_{μ-2} x_1 + λ_{μ-1} x_2   (μ ≥ 4).


The reduction to normal form of the germ of the time function is accomplished
by a local homeomorphism of the space R^{μ-1} (C^{μ-1}), which preserves
the large caustic and which is smooth everywhere except at 0 (V. I. Bakhtin,
1984).

J.  Nye  (1984)  has  noticed  that  not  all  metamorphoses  of  caustics  and  fronts 
may  be  realized  by  the  motion  of  a  front  under  an  equation  of  eikonal  (or 
Hamilton-Jacobi)  type.  For  example,  the  caustic  of  a  ray  system  cannot  have 
the  form  of  “lips”  with  two  cusps  (although  this  is  possible  for  lagrangian 
caustics).  The  point  is  that  the  inclusion  of  a  lagrangian  or  Legendre  manifold 
in  the  hypersurface  given  by  a  Hamilton-Jacobi  or  eikonal  equation  imposes 
topological  restrictions  on  the  coexistence,  and  thus  on  the  metamorphoses, 
of  singularities,  even  though  the  individual  singularities  may  be  realized  on 
hypersurfaces.  This  is  namely  the  case  when  the  level  surface  of  the  hamiltonian 
is  locally  nondegenerately  convex  in  the  momentum  variables. 

The  vector  fields  generating  the  diffeomorphisms  preserving  a  front  are 
those  which  are  tangent  to  it.  The  study  of  these  vector  fields  leads  to  an 
unusual  “convolution”  operation  on  the  invariants  of  a  reflection  group. 
To  a  pair  of  invariants  (functions  on  the  orbit  space)  we  associate  a  new 
invariant: the scalar product of the gradients of the functions (pulled back
from  the  orbit  space  to  the  original  euclidean  space). 
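
As an illustration of this operation (a sketch chosen for this purpose, not an example from the text), take the symmetry group of the square acting on the plane, with basic invariants u = x^2 + y^2 and v = x^2 y^2. The scalar products of the gradients are again invariants and can be expressed through u and v: (∇u, ∇u) = 4u, (∇u, ∇v) = 8v, (∇v, ∇v) = 4uv. The code checks these identities at a few integer points; all function names are hypothetical.

```python
def invariants(x, y):
    """Basic invariants of the symmetry group of the square."""
    return x * x + y * y, x * x * y * y


def grad_u(x, y):
    return (2 * x, 2 * y)


def grad_v(x, y):
    return (2 * x * y * y, 2 * x * x * y)


def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]


def convolution_table(x, y):
    """Scalar products of gradients: (uu, uv, vv) at the point (x, y)."""
    gu, gv = grad_u(x, y), grad_v(x, y)
    return dot(gu, gu), dot(gu, gv), dot(gv, gv)


for x, y in [(1, 2), (-3, 5), (7, -4)]:
    u, v = invariants(x, y)
    # The results depend on (x, y) only through the invariants u and v.
    assert convolution_table(x, y) == (4 * u, 8 * v, 4 * u * v)
```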

The  linearization  of  this  operation  defines  a  symmetric  bilinear  mapping 
from  each  cotangent  space  of  the  orbit  space  into  itself. 


Theorem (1979). The linearized convolution of invariants of a reflection group
is isomorphic as a bilinear operation to the operation on the local algebra of
the corresponding singularity given by the formula (p, q) ↦ S(p·q), where
S = D + (2/h)E, D is Euler's quasi-homogeneous derivation, and h is the
Coxeter number.


In 1981, A. N. Varchenko and A. B. Givental’ (who also proved the theorem
above for the exceptional groups) found a far-reaching generalization of this
result. They replaced the euclidean structure by the intersection form of
the  underlying  period  mapping,  which  arises  from  a  family  of  holomorphic 
differential  forms  on  the  fibres  of  the  Milnor  fibration  of  a  versal  family  of 
functions.  A  nondegenerate  intersection  form  defines  (depending  on  the  parity 
of  the  number  of  variables)  either  a  locally  flat  pseudo-euclidean  metric  with 
a  standard  singularity  on  the  Legendre  front  or  a  symplectic  structure  which 
extends  holomorphically  to  the  front. 

Example.  The  space  of  monic  polynomials  with  odd  degree  and  sum  of  the 
roots  equal  to  zero  acquires  yet  another  symplectic  structure.  Relative  to  this 
structure,  the  submanifold  of  polynomials  with  the  maximal  number  of  double 
roots  turns  out  to  be  lagrangian. 

When  the  intersection  form  is  indefinite,  the  symplectic  structure  is  replaced 
by  a  Poisson  structure  (see  Appendix  14). 

F  Tangential  singularities 

The  first  applications  of  the  theory  of  lagrangian  and  Legendre  singularities, 
around  which  the  theory  itself  developed  (~1966),  concerned  short  wave 
asymptotics  in  the  form  of  the  asymptotics  of  oscillatory  integrals.  A  survey 
of  these  applications  (including  the  determination  of  uniform  estimates  for 
oscillatory  integrals  when  saddle  points  meet,  the  calculation  of  asymptotics 
using Newton polyhedra, the construction of mixed Hodge structures, applications
to number theory and the theory of convex polyhedra, and estimates
of  the  index  of  singular  points  of  vector  fields  and  the  number  of  singular 
points  of  algebraic  surfaces)  may  be  found  in  the  book: 

V. I. Arnold, A. N. Varchenko, and S. M. Gusein-Zade, Singularities of
Differentiable Mappings, Vol. II, Monodromy and Asymptotics of Integrals,
Moscow, Nauka, 1984. English translation: Birkhäuser, 1988.

and  in  the  paper 

V.  I.  Arnold,  Singularities  of  ray  systems,  Proceedings  of  the  International 
Congress  of  Mathematicians,  August  16-24, 1983,  Warsaw. 

Here  we  shall  present  other  applications  of  the  theory  of  lagrangian  and 
Legendre singularities to the study of the configurations of projective manifolds
and tangential planes of various dimensions. One is led to such problems
from  variational  problems  with  one-sided  constraints  (such  as  the  obstacle 
problem),  as  well  as  from  the  study  of  Nekhoroshev’s  exponent  of  roughness 
for  unperturbed  hamiltonian  functions  (see  Appendix  8). 

We  consider  a  generic  surface  in  three-dimensional  projective  space  (Figure 
259).  The  curve  of  parabolic  points  (p)  divides  the  surface  into  a  domain  of 
elliptic  points  (e)  and  a  domain  of  hyperbolic  points  (h);  the  latter  domain 
contains the curve of inflection points of the asymptotic lines (f), with its
points of biinflection (b), self-intersection (c), and tangency to the parabolic
curve (t).

Figure 259 Projective classification of points of a surface

From this classification of points, one may derive both estimates of curvature
exponents and the following classification of projections.

Theorem (O. A. Platonova and O. P. Shcherbak, 1981). Every projection from
a point outside a generic surface in RP^3 is locally equivalent at each point of
the surface to the projection along lines parallel to the x-axis of a surface
z = f(x, y), where f is one of the following 14 functions:

x, x^2, x^3 + xy, x^3 ± xy^2, x^3 + xy^3, x^4 + xy,
x^4 + x^2 y + xy^2, x^5 ± x^3 y + xy, x^3 ± xy^4, x^4 + x^2 y + xy^3, x^5 + xy.

By a projection we mean here a diagram V → E → B consisting of an
embedding and a fibration; an equivalence of projections is then a 3 x 2
commutative diagram whose vertical arrows are diffeomorphisms.

The  only  singularities  of  the  projection  from  a  generic  center  are  folds  and 
Whitney  tucks.  The  tucks  appear  when  the  projection  is  along  an  asymptotic 
direction.  The  remaining  singularities  are  visible  only  from  special  points.  The 
finiteness  of  the  number  of  singularities  of  projections  (and  therefore  the  number 
of  singularities  of  apparent  contours)  was  not  obvious  before  the  result  above 
was  obtained,  since  there  is  a  continuum  of  inequivalent  singularities  for 
generic  three-parameter  families  of  mappings  from  a  surface  to  the  plane. 
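
The Whitney tuck can be checked by hand on the normal form f = x^3 + xy from the list above. The sketch below, whose function names are hypothetical, projects the surface z = f(x, y) along the x-axis onto the (y, z)-plane. The critical points are those where ∂f/∂x = 3x^2 + y = 0, and the apparent contour is the curve (y, z) = (-3x^2, -2x^3), which satisfies 27z^2 + 4y^3 = 0, a semicubic cusp.

```python
from fractions import Fraction


def f(x, y):
    """Normal form of the Whitney tuck from the list of projections."""
    return x**3 + x * y


def contour_point(x):
    """Point (y, z) of the apparent contour above the parameter x.

    The projection along the x-axis is critical where df/dx = 3 x^2 + y = 0.
    """
    y = -3 * x**2
    return y, f(x, y)


for x in [Fraction(-2), Fraction(1, 3), Fraction(5, 2)]:
    y, z = contour_point(x)
    # The contour lies on the semicubic parabola 27 z^2 + 4 y^3 = 0.
    assert 27 * z**2 + 4 * y**3 == 0
```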

The regions of space from which the generic surface has a different appearance,
as well as the corresponding views of the surface, are shown in Figure 260
(for  the  most  complicated  cases). 

The  hierarchy  of  tangential  singularities  becomes  more  comprehensible 
when  it  is  reformulated  in  terms  of  symplectic  and  contact  geometry.  R. 
Melrose  (1976)  observed  that  the  rays  tangent  to  a  surface  are  described  by 
a pair of hypersurfaces in symplectic phase space: one of them, p^2 = 1, is
defined  by  the  metric;  the  other  is  defined  by  the  surface. 

A  significant  part  of  the  geometry  of  asymptotic  lines  may  be  reformulated 
in  terms  of  this  pair  of  hypersurfaces.  In  this  way,  we  may  transfer  concepts 
from the geometry of surfaces to the more general case of arbitrary pairs of
hypersurfaces in symplectic space, and thereby use the geometric intuition
gained from surface theory to study general variational problems with one-sided
phase constraints.

Let  Y  and  Z  be  hypersurfaces  in  the  symplectic  space  X  which  intersect 
transversely  along  a  submanifold  W.  Projecting  Y  and  Z  onto  their  manifolds 
of  characteristics,  we  obtain  the  hexagonal  diagram 


in  which  E  is  the  common  manifold  of  critical  points  for  the  projections  of  W 
on  U  and  V. 

Example. Let X be the {q, p} phase space for a free particle in euclidean space
(q is the position of the particle, p its momentum). Y is the manifold of unit
vectors (p^2 = 1). Z is the manifold of vectors at the boundary (q belongs to
a hypersurface Γ). Then U is the manifold of rays, V is the tangent bundle of
the boundary Γ, W is the manifold of unit vectors at the boundary, and E is
the unit tangent bundle of the boundary.

If  a  unit  tangent  vector  to  the  boundary  is  not  asymptotic,  then  both  of  the 
projections W → U and W → V have fold singularities at this point. Each of
them  defines  an  involution  on  W  which  fixes  E. 

Example. There are two involutions, σ and τ, on the manifold of tangent
vectors along a convex plane curve W (Figure 261). Their product is Birkhoff's
billiard mapping (1927).
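
The two involutions can be written out explicitly in the simplest case. The sketch below assumes the curve is the unit circle, which is a choice made here for illustration. A point of W is a pair (q, v) with q on the circle and v a unit vector. The first involution (called `sigma` here) keeps the line and moves q to the other intersection point. The second (`tau`) keeps q and reflects v in the tangent line. Both fix exactly the tangent vectors, and their composition is the billiard map.

```python
import math


def sigma(q, v):
    """Same line, other intersection point with the unit circle."""
    s = -2 * (q[0] * v[0] + q[1] * v[1])
    return (q[0] + s * v[0], q[1] + s * v[1]), v


def tau(q, v):
    """Same point, direction reflected in the tangent line (the unit normal is q)."""
    d = 2 * (q[0] * v[0] + q[1] * v[1])
    return q, (v[0] - d * q[0], v[1] - d * q[1])


def billiard(q, v):
    """Billiard map: travel along the chord, then reflect at the boundary."""
    return tau(*sigma(q, v))


q, v = (1.0, 0.0), (-0.6, 0.8)      # inward-pointing unit vector at (1, 0)
q1, v1 = billiard(q, v)
# For the circle, the scalar product q.v (the angle of incidence) is conserved.
print(q1, v1, q1[0] * v1[0] + q1[1] * v1[1])
```

Applying `sigma` or `tau` twice returns the starting pair, up to rounding.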

Using pairs of involutions, Melrose found a local normal form for pairs of
hypersurfaces in symplectic space which are in the situation just described.
(This was for the C^∞ case; in the analytic case, one usually obtains divergent
series, just as in the theory of Écalle (1975) and Voronin (1981) on resonant
dynamical systems.)

Figure 261 The two involutions generating the billiard mapping

For more complicated singularities (for example, near asymptotic directions),
pairs of hypersurfaces have moduli. For the two simplest singularity
types after the fold, it is possible to put in normal form (at least formally)
the pair consisting of the first hypersurface and its intersection with
the  second.  This  allows  us  to  study,  in  a  neighborhood  of  an  asymptotic  or 
biasymptotic  unit  tangent  vector  to  the  boundary,  the  mapping  which  assigns 
the  ray  containing  it  to  each  unit  vector  at  the  boundary.  The  critical  values 
of  this  mapping  in  the  symplectic  space  of  lines  are  described  by  the  following 
result,  since  the  manifold  of  tangent  rays  is  locally  diffeomorphic  near  a 
biasymptotic  ray  to  the  product  of  a  swallowtail  and  a  line. 

Theorem (1981). All the generic symplectic structures in the neighborhood of a
point in the direct product of a swallowtail and a linear space are formally
diffeomorphic by local diffeomorphisms preserving the product structure.

G  The  obstacle  problem 

We  consider  an  obstacle  bounded  by  a  smooth  surface  in  euclidean  space. 
The  obstacle  problem  consists  of  the  study  of  the  singularities  of  the  function 
defined  outside  the  obstacle  whose  value  at  each  point  is  the  length  of  the 
shortest  path  remaining  outside  the  obstacle  and  joining  the  point  to  a  fixed 
initial  set.  This  variational  problem  on  a  manifold  with  boundary  is  unsolved 
even  in  three-dimensional  space. 

Each  minimizing  path  consists  of  segments  of  straight  lines  and  segments 
of  geodesics  on  the  surface  of  the  obstacle  (Figure  262).  We  consider  therefore 
a  system  of  geodesics  on  the  surface  of  the  obstacle,  orthogonal  to  a  fixed  front. 
The  system  of  all  rays  tangent  to  these  geodesics  forms  a  lagrangian  variety 
in the symplectic space of lines, just as any system of extremals for a variational
problem. But while in an ordinary variational problem this lagrangian
variety  is  a  smooth  manifold  (even  at  caustics),  the  lagrangian  variety  arising 
in  the  obstacle  problem  has  singularities.  From  the  last  theorem  (in  the 
previous section), one obtains:

Figure 263 The open (“unfurled”) swallowtail


Corollary (1981). The lagrangian variety of rays in a generic obstacle problem
has a semicubic cuspidal edge along each asymptotic ray and a singularity
diffeomorphic to an open swallowtail at each biasymptotic ray.

The open swallowtail is the surface in the four-dimensional space of monic
polynomials x^5 + Ax^3 + Bx^2 + Cx + D formed by the polynomials with
triple  roots.  Differentiation  of  the  polynomials  turns  the  open  swallowtail  into 
an  ordinary  one;  when  the  swallowtail  is  opened,  the  cuspidal  edge  is  retained, 
but  the  self-intersection  disappears  (Figure  263). 
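
The passage from the open swallowtail to the ordinary one can be traced with exact arithmetic. This is a minimal sketch, assuming the parametrization (x - t)^3 (x^2 + 3tx + s) of quintics with a triple root and vanishing x^4 coefficient; the names t, s and all helpers are hypothetical. Differentiating and dividing by 5 gives a quartic x^4 + ax^2 + bx + c with a double root at t, i.e. a point of the ordinary swallowtail.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def open_swallowtail_point(t, s):
    """Coefficients (A, B, C, D) of (x - t)^3 (x^2 + 3 t x + s)."""
    cube = poly_mul(poly_mul([-t, 1], [-t, 1]), [-t, 1])
    p = poly_mul(cube, [s, 3 * t, 1])
    assert p[4] == 0 and p[5] == 1     # monic, no x^4 term
    return p[3], p[2], p[1], p[0]


def to_swallowtail(A, B, C, D):
    """Differentiate x^5 + A x^3 + B x^2 + C x + D and normalize to a monic quartic."""
    return Fraction(3 * A, 5), Fraction(2 * B, 5), Fraction(C, 5)


t, s = 1, 2
a, b, c = to_swallowtail(*open_swallowtail_point(t, s))
# The image has a double root at x = t: the value and the derivative vanish.
assert t**4 + a * t**2 + b * t + c == 0
assert 4 * t**3 + 2 * a * t + b == 0
```

The constant term D is lost under differentiation, which is how distinct points of the open swallowtail can land on the same point of the self-intersection line.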

Theorem  (1981).  In  the  generic  motion  of  a  wave  front,  the  cuspidal  edges  of 
the  instantaneous  fronts  sweep  out  an  open  swallowtail  in  four-dimensional 
space-time  (over  the  usual  swallowtail  caustic). 

Theorem (O. P. Shcherbak, 1982). Consider a generic one-parameter family of
space curves and suppose that, for some value of the parameter (time), one of
the curves has a point of double flatness (of type 1, 2, 5). Then the projective
duals of these curves form a surface in space-time which is locally
diffeomorphic to the open swallowtail.

The  open  swallowtail  is  the  first  member  of  a  whole  series  of  singularities. 
Consider, in the space of monic polynomials x^n + λ_1 x^{n-2} + ⋯ + λ_{n-1}, the set
of polynomials with a root of fixed comultiplicity k, (x - a)^{n-k}(x^k + ⋯).
Differentiation  of  polynomials  preserves  the  comultiplicity  of  roots. 
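
The last statement follows from a one-line computation, sketched here with g denoting the cofactor of degree k (a symbol introduced only for this remark):

```latex
\frac{d}{dx}\Bigl[(x-a)^{n-k}\,g(x)\Bigr]
  = (x-a)^{\,n-k-1}\Bigl[(n-k)\,g(x) + (x-a)\,g'(x)\Bigr],
\qquad \deg g = k .
```

The derivative has degree n - 1 and a root at a of multiplicity at least n - k - 1, so its comultiplicity is (n - 1) - (n - k - 1) = k.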

Theorem (A. B. Givental’, 1981). The sequence of sets of polynomials of fixed
comultiplicity becomes stabilized as the degree grows, beginning with degree
n = 2k + 1 (i.e., when the self-intersections are eliminated).

Example.  The  open  swallowtail  is  the  first  stable  variety  over  the  ordinary 
swallowtail. 

The  appearance  of  swallowtails  in  the  obstacle  problem  was  axiomatized 
by Givental’ (1982) in his theory of triads.

Definition. A symplectic triad (H, L, l) consists of a smooth hypersurface H in
a symplectic manifold and a lagrangian submanifold L which is tangent to H
to first order along a hypersurface l of L.

The  lagrangian  variety  generated  by  the  triad  is  the  image  of  L  in  the 
manifold  of  characteristics  of  the  hypersurface  H. 

Example 1. Consider, in the problem of bypassing an obstacle with boundary
Γ ⊂ R^n, the distance along geodesics from an initial front as a function
s: Γ → R. The manifold L consisting of all extensions of the 1-form ds from Γ
to R^n, together with the hypersurface H: p^2 = 1, forms a triad. The lagrangian
variety generated by this triad is precisely the variety of rays tangent to the
geodesics in our system of extremals on Γ.

Example 2. In the symplectic manifold of monic polynomials F = x^d +
λ_1 x^{d-1} + ⋯ + λ_d with even degree d = 2m, the polynomials divisible by x^m
form a lagrangian submanifold L.

Consider the hamiltonian for translation along the x-axis. [This polynomial
in λ is equal to

h = Σ (-1)^i F^(i) F^(j),   i + j = d,   F^(i) = d^i F/dx^i.

The hypersurface h = 0 is tangent to the lagrangian submanifold L along
the subspace l of polynomials divisible by x^{m+1}, thus forming a triad. The
lagrangian variety generated by this triad is an open swallowtail of dimension
m - 1 (the set of polynomials x^{d-1} + a_1 x^{d-3} + ⋯ + a_{d-2} having a root of
multiplicity greater than half the degree).]

Theorem  (A.  B.  Givental’,  1982).  The  triads  in  Example  2  are  stable.  Every  germ 
of  a  generic  triad  is  diffeomorphic  to  a  germ  of  a  triad  in  Example  2. 

Corollary.  The  variety  of  rays  tangent  to  the  geodesics  in  the  system  of  extremals 
of  a  generic  obstacle  problem  is  locally  symplectically  diffeomorphic  to  a 
lagrangian  open  swallowtail. 

In  contact  geometry,  there  are  two  kinds  of  Legendre  varieties  associated 
to  obstacle  problems:  varieties  of  contact  elements  of  fronts  and  varieties 
of 1-jets of time functions. The first of these are diffeomorphic to lagrangian
open  swallowtails;  the  second  are  diffeomorphic  to  cylinders  over  the 
first. 

Example.  Consider  the  problem  of  bypassing  an  obstacle  in  the  plane  which 
is  bounded  by  a  curve  with  an  inflection  point.  The  fronts,  which  are  the 
evolvents  of  the  curve,  have  two  kinds  of  singularities:  ordinary  cusps  (of  order 
3/2)  on  the  curve  itself  and  singularities  of  order  5/2  on  the  tangent  line 
through the inflection point (Figure 264). Over points of the boundary curve,
the Legendre variety is nonsingular, while over points on the tangent line
through the inflection point it has a cuspidal edge of order 3/2.

Theorem (1978). In the space of contact elements to the plane, fibred over the
plane itself, the surface consisting of the contact elements of the evolvents of
a generic curve near a point of inflection is locally equivalent by a
fibre-preserving diffeomorphism to the surface consisting of all polynomials with
multiple roots in the space of polynomials x^3 + ax^2 + bx + c, fibred into
lines parallel to the b-axis.

This surface (Figure 265), together with the surface c = 0 representing the
contact elements along the boundary curve, forms a variety which is
diffeomorphic to the set of irregular orbits for the reflection group B_3. This
observation led to the theory of boundary singularities (1978).

Example  (I.  G.  Shcherbak,  1982).  Consider  a  generic  curve  on  a  surface  in 
three-dimensional  euclidean  space.  At  certain  points,  the  direction  of  the  curve 
coincides  with  principal  curvature  directions  of  the  surface.  It  follows  from 
the theory of lagrangian boundary singularities that the Weyl group F_4 is
connected with each such point: the focal points of the surface (A_2), focal
points of the curve (A'_2), and normals to the surface at points of the curve (B_2)
together form an F_4 caustic near the center of curvature (Figure 266).

Figure 265 The surface of contact elements of the evolvents

Figure 266 The caustic’s singularity F_4

We  will  not  dwell  here  on  the  theory  of  boundary  singularities,  but  it  is 
worth mentioning the “Lagrange duality” relating a function and its restriction
to the boundary (up to stable equivalence): this may be thought of as
a  modern  version  of  the  Lagrange  multiplier  rule  (I.  G.  Shcherbak,  1982). 

Returning  to  inflection  points  of  plane  curves,  we  consider  the  graph  of  the 
multiple-valued  time  function  in  an  obstacle  problem.  The  level  curves  of  this 
function  are  the  evolvents  of  the  obstacle  boundary.  Therefore,  the  graph  of 
this  function  has  the  form  (shown  in  Figure  267)  of  a  surface  with  two  cuspidal 
edges (of orders 3/2 and 5/2). When I showed this surface to A. B. Givental’,
he recognized O. V. Lyashko’s drawing of the variety Σ of singular orbits of the
group H_3 (symmetries of the icosahedron). Givental’s conjecture was soon verified:


Figure 267 The discriminant of H_3

Theorem (O. P. Shcherbak, 1982). The graph of the (multiple-valued) time
function in the problem of bypassing an obstacle bounded by a generic plane
curve is formally diffeomorphic near an inflection point of the curve to the
variety Σ.

The  proof  of  this  theorem  uses: 

Theorem (O. V. Lyashko, 1981). The variety Σ is diffeomorphic to the variety
of polynomials x^5 + ax^4 + bx^2 + c having a multiple root.

Lyashko’s theorem describes the variety of singular orbits for the group H_3
as the union of the tangents to the curve (t, t^3, t^5), while Shcherbak’s theorem
applies to any curve of the form (t + o(t), t^3 + o(t^3), t^5 + o(t^5)).
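
The union of tangents to the curve (t, t^3, t^5) is easy to parametrize. The sketch below, with hypothetical names and a parameter s along each tangent line, computes the cross product of the two partial derivatives of the parametrization. It equals (-30 t^5 s, 20 t^3 s, -6 t s), so the parametrization fails to be an immersion exactly where ts = 0: along the curve itself (s = 0) and along the tangent line at the point t = 0. This agrees with the two cuspidal edges mentioned above, although the sketch does not determine their orders.

```python
def tangent_surface(t, s):
    """Point at parameter s on the tangent line to (t, t^3, t^5) at parameter t."""
    return (t + s, t**3 + 3 * t**2 * s, t**5 + 5 * t**4 * s)


def normal(t, s):
    """Cross product of the partial derivatives with respect to t and s."""
    rt = (1, 3 * t**2 + 6 * t * s, 5 * t**4 + 20 * t**3 * s)
    rs = (1, 3 * t**2, 5 * t**4)
    return (rt[1] * rs[2] - rt[2] * rs[1],
            rt[2] * rs[0] - rt[0] * rs[2],
            rt[0] * rs[1] - rt[1] * rs[0])


# The normal vanishes on the curve (s = 0) and on the tangent line at t = 0.
print(normal(2, 0), normal(0, 5), normal(1, 1))
```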

The same singularity appears on a generic front at the point of tangency of
an asymptotic ray with the bounding surface of an obstacle in R^3.

Finally, we describe a variational problem leading to the singularity H_4
(after  O.  P.  Shcherbak). 

The group H_4 consists of the symmetries of a regular polyhedron in R^4. Its
120 vertices lie on S^3 ≈ SU(2) and form the binary icosahedral group (the
binary group being the inverse image of the symmetry group of the icosahedron
under the double covering S^3 → SO(3)).

Consider  the  problem  of  bypassing  an  obstacle  bounded  by  a  smooth 
surface  in  three-dimensional  euclidean  space.  The  extremals  beginning  at  a 
fixed  point  outside  the  obstacle  generate  a  pencil  (one-parameter  family)  of 
geodesics  on  the  surface.  A  time  function  is  the  distance  from  a  fixed  initial 
manifold  (e.g.,  a  point)  along  stationary  (not  necessarily  minimizing)  paths 
consisting of arcs of geodesics and their tangents, considered as a (multiple-valued)
function of the terminal point in space (solution of the Hamilton-Jacobi
equation).

Theorem (O. P. Shcherbak, 1984). For a generic obstacle, the graph of the time
function at a point which is focal for the pencil along an asymptotic tangent
at a parabolic point of the surface is locally diffeomorphic to the variety Σ of
singular orbits of the group H_4.

An explicit parametrization of Σ is:

(a, b^2/2 + ac, c^2/2 + ab^3, b^5/5 + c^3/3 + ab^3 c).

The group H_4 is related to a four-dimensional subspace of the base space
of the versal deformation of E_8 (this connection is explained in Remark 7, §9
of the paper by V. I. Arnold, Indices of singular points of 1-forms on manifolds
with boundary, convolution of invariants of reflection groups, and
singular projections of smooth surfaces, Russian Math. Surveys 34:2 (1979),
1-42).

Corresponding to this four-dimensional subspace, there is an embedding
of the local algebra A_4 into the local algebra E_8, which induces on the former
the same grading which is given by the convolution of invariants of H_4.
O. P. Shcherbak has shown that this relationship establishes yet another
description of the variety of singular orbits of H_4:

Theorem. Consider those values of λ for which the curve x^5 + y^3 + λ_1 x^3 y +
λ_2 x^3 + λ_3 y + λ_4 = 0 is singular. One of the irreducible components of this
three-dimensional hypersurface in λ-space is diffeomorphic to the variety of
singular orbits of the group H_4.

The caustic and three typical sections of the variety of singular orbits of
H_4 are shown in Figures 268 and 269. See O. P. Shcherbak, Wavefronts and
reflection groups, Russian Math. Surveys, 43 (1988).